return 0;

Improving C

C is my favorite programming language, but it would be more accurate to describe it as "the language I hate the least". This page has some changes to C that would allow me to confidently call it my favorite.

A lot of people have tried to "fix C" by making another language, but all of them (that I've seen) change too much from what kind of language C is. It's fine to make something different, but then you're not "fixing C" anymore, you're making a different language. I think that C is almost all good, there's just some nuances that make it unnecessarily clumsy or annoying, and most of those could be fixed without changing what C is or even breaking compatibility with the current C.

C++ has some advantages over C, but it also has some unhelpful regressions, for example there's no way to automatically cast void pointers, you're forced to define struct members in the correct order, worse designated initializers, etc. I also just feel less happy using a language that is so bloated with things that I don't need or want, and that is under the control of a committee that is completely detached from what I care about and keeps adding more and more to the pile of unwanted bloat.

Note: the actual wording and semantics are not so important. For example below I propose always_inline as a keyword, but I don't actually care what it's called as long as it exists, it might as well be __always_inline__ or an attribute [[always_inline]] (since C23) or something else.

Some of these may require the compiler to make multiple passes, but I don't know of any good reasons for why that's a problem. It might be an issue for something like TCC, but from my perspective compilers for obscure meta use-cases should have their own restrictions where necessary, it's stupid to gimp one of the most popular languages in the world for everyone just because someone wants to compile C programs with their wrist watch.

"It would require big changes to the compiler" also isn't a valid argument. Compiler developers' laziness isn't a valid argument against making the language better.

"It would make the compiler slower" isn't a good argument because the compilers are already extremely bad and way slower than they should be. The unreleased Jai language is much more complicated than C, yet its compiler compiles it 10x faster than C compilers compile C. If you were concerned about compile time you should be complaining to the compiler developers, not the language designers.


Undeniable improvements

As far as I'm concerned, there is no valid argument against these changes.

Nested functions

void check_adjacent (int x, int y) {
int count = 0;
void check (int x, int y) {
...
count ++;
}
check(x, y-1);
check(x, y+1);
check(x-1, y);
check(x+1, y);
}

I do this kind of thing all the time, but it's a GCC compiler extension. You can use goto to do something vaguely similar, but it will always be super clumsy and error prone compared to nested functions.
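For comparison, here's a minimal sketch of how the same pattern tends to look in standard C without the extension: the helper moves to file scope and the shared counter has to be passed in by pointer. The grid array, its size and the name count_adjacent are assumptions made up for this illustration.

```c
#include <stdbool.h>

#define GRID_W 4
#define GRID_H 4

// Hypothetical grid: nonzero cells count as "occupied".
static const int grid[GRID_H][GRID_W] = {
    {0, 1, 0, 0},
    {1, 0, 1, 0},
    {0, 1, 0, 0},
    {0, 0, 0, 0},
};

// The helper can no longer see the caller's locals,
// so the counter is passed in explicitly.
static void check (int x, int y, int* count) {
    bool inside = x >= 0 && x < GRID_W && y >= 0 && y < GRID_H;
    if (inside && grid[y][x]) (*count) ++;
}

// Returns how many of the 4 neighbours of (x, y) are occupied.
int count_adjacent (int x, int y) {
    int count = 0;
    check(x, y-1, &count);
    check(x, y+1, &count);
    check(x-1, y, &count);
    check(x+1, y, &count);
    return count;
}
```

Every extra local the helper needs becomes another parameter (or a context struct), which is the clumsiness that nested functions avoid.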

Nested functions can also be sent elsewhere as function pointers, here's another pattern I use all the time:

void load_assets () {
void callback1 (char* file_path, bool is_folder) {
...
}
read_folder_contents("/assets/images/", callback1);

void callback2 (char* file_path, bool is_folder) {
...
}
read_folder_contents("/assets/things/", callback2);
}

Optional stricter typedefs

I use a lot of IDs to manage and refer to things, but there's a problem:

typedef unsigned short Entityid;
typedef unsigned short Itemid;

void destroy_entity (Entityid id) {
...
}

void main () {
Itemid id = 1234;
destroy_entity(id); // You've created a bug, but the compiler won't complain.
}

This should not be possible. There's only one way to fix this; typedef a struct with 1 member that is the ID, because structs have stricter type checking. However that's one of those stupid workarounds that you shouldn't have to do just because the language lacks obviously useful features that would allow you to do it in the obviously correct way.
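A minimal sketch of that struct workaround, to show what it costs. The member name value, the last_destroyed variable and the helper functions are assumptions for this illustration only.

```c
// Each ID gets its own struct type, so the two can no longer be mixed up.
typedef struct { unsigned short value; } Entityid;
typedef struct { unsigned short value; } Itemid;

// Hypothetical stand-in for real entity storage.
static unsigned short last_destroyed = 0;

void destroy_entity (Entityid id) {
    last_destroyed = id.value;
}

unsigned short get_last_destroyed (void) {
    return last_destroyed;
}

void example (void) {
    Itemid item = {1234};
    Entityid entity = {42};
    // destroy_entity(item); // Now a compile error: incompatible argument type.
    destroy_entity(entity);
    (void)item; // Only here to silence the unused variable warning.
}
```

The price is that every use of the number now needs .value, and arithmetic or comparisons on IDs no longer work directly.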

From language design perspective, fixing this could be as trivial as a new keyword:

typedef_strict unsigned short Entityid;
typedef_strict unsigned short Itemid;

This is ESPECIALLY important for enum values, although somehow it's less common for me to mix those up.

Automatic casting for struct literals

This is possible in C++, and I don't understand why it isn't in C.

typedef struct {
float x;
float y;
} Vec2f;

void test (Vec2f a, Vec2f b) {
...
}

void main () {
Vec2f v;

// Currently, you have to do this:
v = (Vec2f){5,22};
test(v, (Vec2f){1,2});

// You should be able to do just this:
v = {5,22};
test(v, {1,2});
}

I can faintly hear someone arguing that it's "harder to read the code" because you don't see the type next to every definition, but that's such a poor and invalid justification for denying this kind of massive code comfort and readability improvement that I'll just turn down that complaint outright. There's a ton of places where you have exactly the same kind of lack of direct visibility to the type and none of them are a problem. For example if you do float v; v = 0; you won't see what type v is in the second statement, it could be any base type including a pointer or a bool. You just look at the definition of the variable if you want to check what the type is. Besides, you could still add in the explicit cast if you wanted to.

Nested break and continue

for (x) {
for (y) {
if (y == 10) break; // Breaks the inner loop.
if (x == 10) break 2; // Breaks 2 loops, i.e. both.
}
}

With labels:

outer: for (x) {
inner: for (y) {
if (y == 10) break inner;
if (x == 10) break outer;
}
}

Labels already exist, but they're only used by goto. You can do it this way in current C:

for (x) {
for (y) {
if (y == 10) goto inner;
if (x == 10) goto outer;
} inner:
} outer:

It's pretty clumsy and ugly to play with labels this way though, the label needs to be on the other side of the braces for continue, and it interferes with your flow of programming when you have to change your way of thinking in this way. This is what it looks like if you want to prepare labels for both continue and break:

for (x) {
for (y) {
...
continue_inner:
} break_inner:
...
continue_outer:
} break_outer:
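As a compilable version of the goto approach, here's a small sketch. The function names and the flat cells array are assumptions for this illustration.

```c
// Returns the index (y * width + x) of the first cell equal to target,
// or -1 if there is none.
int find_first (const int* cells, int width, int height, int target) {
    int found = -1;
    for (int y = 0; y < height; y ++) {
        for (int x = 0; x < width; x ++) {
            if (cells[y * width + x] == target) {
                found = y * width + x;
                goto break_outer; // Leaves both loops at once.
            }
        }
    }
    break_outer:
    return found;
}

// Counts the rows that contain no negative values.
int count_clean_rows (const int* cells, int width, int height) {
    int clean = 0;
    for (int y = 0; y < height; y ++) {
        for (int x = 0; x < width; x ++) {
            if (cells[y * width + x] < 0) goto continue_outer;
        }
        clean ++;
        continue_outer: ; // Before C23 a label must be followed by a statement.
    }
    return clean;
}
```

Note the empty statement after continue_outer: a label directly before a closing brace is only valid from C23 onwards.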

Just let me type break 2. Or alternatively, allow stacking breaks:

for (x) {
for (y) {
if (y == 10) break; // Breaks the inner loop.
if (x == 10) break break; // Breaks both loops.
if (x == 8) break continue; // Breaks the inner loop, and continues on the outer loop.
}
}

Arbitrary length character literals

I often do something like this:

int foo = *(int*)"help";

Exploiting string casting like this is dangerous and weird, the compiler won't be able to tell when I make a mistake. But if I could just set the value as a character literal (which is precisely my intention), the compiler would know that my intent is to set a 4 byte value onto the integer:

int foo = 'help';

I don't know what should happen if the size of the character literal is different from the integer. My intuition tells me that it shouldn't allow you to do this unless the size is exactly the same, i.e. both 'lol' and 'hello' should give a compiler error.

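Currently, C does accept character literals with more than 1 character, but their value is implementation-defined, so you can't rely on how the characters end up in the integer. I don't see any reason why it couldn't be properly defined.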
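Until something like this is defined by the language, a hedged sketch of how the same 4 byte value can be built portably today. The FOURCC and load_u32 names are assumptions for this illustration, and the byte order chosen by the macro (first character in the lowest byte) is just one possible convention.

```c
#include <stdint.h>
#include <string.h>

// Packs 4 characters so that 'a' ends up in the lowest byte,
// whatever the byte order of the host is.
#define FOURCC(a, b, c, d) ( \
    (uint32_t)(unsigned char)(a)       | \
    (uint32_t)(unsigned char)(b) << 8  | \
    (uint32_t)(unsigned char)(c) << 16 | \
    (uint32_t)(unsigned char)(d) << 24 )

// Reads 4 bytes from memory without the alignment and aliasing
// problems of *(int*)ptr. The resulting value depends on host byte order.
uint32_t load_u32 (const char* bytes) {
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

FOURCC is a constant expression, so it also works in case labels and static initializers, but it is still a lot more typing than 'help'.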

New built-in constants: __function_id__, __function_count__

__function_id__ returns a unique integer for each function.

__function_count__ returns an integer for the total number of functions in the program.

These would make it easier to implement profiling and debugging functionality.

Custom string delimiters

When you define a string with a lot of quotes and/or new lines, it becomes a huge mess. The easy solution is custom string delimiters.

char* sometext = #string FOOBAR"Hello world,
this
is "a story" about
coding and stuff"
FOOBAR;

Safe alloca()

The function alloca() may fail, but there's no way to detect when it does. Simply add a new function that returns NULL when it fails, just like malloc().

Better ways to inline code

I want easier and cleaner ways to inline a bunch of stuff, but neither macros nor inline functions do it well.

Macros can't return a value, don't do type checking, and having to escape new lines and parenthesize every variable is very ugly and annoying. Inline functions don't actually inline the code the way you'd expect (it won't be optimized as well as properly inlined code would), and aren't guaranteed to be inlined to begin with.

Here's some ideas that could help, there's more approaches but I won't bother listing everything.

Function attribute: static_always_inline

Ideally the function shouldn't even exist in the program, it should only serve to inline its contents wherever it's called from, before any kind of optimizations.

There are compiler extensions that always inline and give an error if they cannot do so, but I don't know if they exhibit the behavior I desire. There's also some stupid limitations to inlining that I don't understand, sometimes it just gives an error and refuses to inline for some arbitrary reason, because it thinks the stack of inlined functions is too long or something..? Better error messages that actually explain what the problem is would be great.
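One such extension is the always_inline function attribute of GCC and Clang. Here's a minimal sketch of using it; the ALWAYS_INLINE macro name and the two functions are assumptions for this illustration, and the sketch does not show whether the inlining happens before other optimizations.

```c
// GCC and Clang only. Other compilers need their own spelling,
// so the fallback is a plain inline hint.
#if defined(__GNUC__)
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE static inline
#endif

ALWAYS_INLINE int add_foobar (int n) {
    n += 100;
    n *= 2;
    return n;
}

// The call below is replaced by the body of add_foobar
// when the attribute is supported.
int use_add_foobar (int n) {
    return add_foobar(n) + 1;
}
```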

Multi-line macros

Macros get real ugly when they're long because you have to use backslashes to escape new lines. I don't care how, but there should be a way to define multi-line macros, maybe something like this:

#define_start add_foobar(n)
n += 100;
n *= 2;
#define_end

Compile-time functions

Just like you can mark a function as "inline", you should be able to mark it as "compile_time". This function would be guaranteed to return its value at compile time. If it cannot do that, it gives a compile error.

constexpr from C++ pretends to do this, but it only allows compile-time evaluation without guaranteeing it. As far as I know, the closest thing to a true compile-time-only function is consteval from C++20.

More flexible struct manipulation

These aren't features that I specifically "want", but I feel that it would be more natural if the language worked this way. I also have a feeling that, if they were possible, I would find some uses for them.

Identical anonymous structs are compatible (and work in function arguments and return values)

void test (struct {int x; int y;} foo) {
...
}

void main () {
test({.x = 123});

struct {
int x;
int y;
} foo = {2, 50};

test(foo);
}

This would benefit greatly from "Automatic casting for struct literals" (see above).

Get struct member directly from return value

Vec2f get_pos () {
return {1, 2};
}
void main () {
float x = get_pos().x;
}

If you combine both of the features above, you could make your own implementation of multiple return values without requiring explicit support from the language for it:

struct {Vec2f pos; int status;} get_pos () {
if (error) return {.status = 123};
return {.pos = {10,25}};
}
void main () {
struct {Vec2f pos; int status;} result = get_pos();
if (result.status) {
printf("Failed!\n");
}
else {
do_stuff(result.pos);
}

Vec2f pos = get_pos().pos; // I only care about the result, not about the status.

int status = get_pos().status; // I only care about whether it succeeded, not about the result.
}

I don't know how useful this would be in practice, but I feel like there may be some situations where I'd want to use it. It's hard to know because I can't use it and get experience with it. But it's important to note that both of the features that would make this possible are features that, in my opinion, should already be how the language works. This particular use case doesn't necessarily have to be a good idea, it still makes sense to make it possible.

This would be even more clean and practical if you could have an anonymous variable on the receiving end:

struct {Vec2f pos; int status;} = get_pos();
if (status) {
printf("Failed!\n");
}
else {
do_stuff(pos);
}

That might be going too far though... or would it really? I feel like this is still very C-like, it's just more syntactic freedom for interacting with structs.


Obvious improvements

These are things that would obviously improve C, but I can imagine arguments (however weak) against them.

Allow dereferencing struct pointers with period

Vec2f* pos = get_pos();
pos->x = 123; // Before
pos.x = 123; // After

This is easier to type (. vs - shift <), easier to edit since you always need the same number of key presses (e.g. arrow keys) to go through (especially when editing with multiple text cursors), much more readable, reduces friction and mental load since you don't have to think about which one to use, makes refactoring and writing code easier since you never have to switch from -> to . or vice versa, and as a cherry on top this can be added to the current C language without breaking anything.

The compiler knows that this is a pointer because it can complain about it, but it specifically chooses to complain instead of just letting you do it. There's currently no valid use for using a period, so you don't have to break anything by allowing it.

The most convincing argument against this I can imagine is that knowing when you're dereferencing is helpful somehow for the sake of making you more conscious of pointers. But this is the same as the argument against "Automatic casting for struct literals" above: it seems like an excuse you'd come up with to justify a decision that was already made, not an argument for making that decision in the first place.

It's that kind of annoying nanny features that make me hate other languages more than C. I want the language to get out of my way regarding things that don't matter and just let me do things with minimal typing, and only do real computer things in exactly the way that I type them. No secret sauce (other than optimizations) happening behind the scenes.

Struct comparison with ==

== should work for structs as long as they have the same type:

Thing foo = {};
Thing bar = {};
if (foo == bar) {
...
}

I deal with position/size/offset values a lot, being able to just check if 2 positions are the same with == would be really nice.

The main argument I can think of against this is: the reason you can check equality for ints is that there's CPU instructions for them. However, there are also instructions for comparing arbitrary amounts of data (the kind that memcmp can be built on), so I don't think CPU instructions are a good perspective to be against this either. This isn't an undeniable improvement because I'm not sure how ubiquitous those instructions are and on what CPU architectures.
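A hedged sketch of the two ways to compare structs in current C, which also shows a question that == for structs would have to answer. The members of Thing and both function names are assumptions for this illustration.

```c
#include <stdbool.h>
#include <string.h>

typedef struct {
    char  tag; // Most ABIs insert padding bytes after this member.
    float x;
    float y;
} Thing;

// Member-wise comparison: ignores padding and follows
// float rules (0.0 equals -0.0, NaN never equals anything).
bool thing_equal (Thing a, Thing b) {
    return a.tag == b.tag && a.x == b.x && a.y == b.y;
}

// Byte-wise comparison: also compares padding bytes, whose contents
// the standard does not guarantee, and treats 0.0 and -0.0 as different.
bool thing_equal_bytes (const Thing* a, const Thing* b) {
    return memcmp(a, b, sizeof *a) == 0;
}
```

A built-in == would presumably have to behave like the member-wise version, since the byte-wise one can give different answers for values that compare equal member by member.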

Variable assignment must be its own statement

if (foo = bar) { ... }
if (foo = getstuff()) { ... }

This shit shouldn't be allowed and I won't pretend to humor any code golfing arguments for it. This is a typo (== vs =) I've made multiple times and the compiler won't tell me about it because it's valid syntax. Runtime bugs caused by this kind of crap make me feel like I'm programming in Javascript.

The only reason this isn't in the "Undeniable improvements" list is backwards compatibility. People have written code using this stupid syntax and changing it would break all that code.

Compilers will warn about this depending on how they're configured, but this shouldn't be allowed in standard C and anyone who thinks otherwise is objectively wrong.

Zero-initialization

Foo test; // All data initialized to zero.
Foo test = __no_init__; // Uninitialized, undefined behavior.

Initializing data to 0 is much more error-proof than having to do it yourself. I've had several hard to track bugs that were caused by me forgetting to initialize a variable somewhere in the depths of my codebase.

Leaving a variable uninitialized should be explicit behavior that you do for performance reasons.

This isn't an undeniable improvement because somehow I feel like this is, in some parts, just an opinion. I do think that zero-initialization is a more desirable default behavior though because that way you can't get a runtime error accidentally.

Making this change would not break compatibility with existing C software, in fact it may fix a bunch of bugs from existing software. It might reduce performance a tiny bit too though, but it's very hard to say how much.
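For reference, a minimal sketch of the explicit spelling that current C requires to get the proposed default. The members of Foo and the make_foo function are assumptions for this illustration.

```c
typedef struct {
    int   id;
    float x;
    float y;
    char  name[16];
} Foo;

Foo make_foo (void) {
    Foo test = {0}; // Every member starts out as zero.
    return test;
}
```

GCC and Clang also have a -ftrivial-auto-var-init=zero option that zero-fills otherwise uninitialized local variables, which is close to the proposed behavior, but it is a compiler setting rather than part of the language.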

16-bit floats (half floats)

I don't know how complete/consistent the support on most CPUs is for 16-bit floats, but most CPUs and GPUs do support these at least to some extent. I'd use these a lot if they were easily usable.

Apparently C++ supports them too.

Enums with bitfield values

bitfield enum {
FOO, // 0x01
BAR, // 0x02
ZYZ, // 0x04
XOR, // 0x08
};

Self-explanatory. You can of course define the values of an enum manually, but that's kind of annoying and error-prone.
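The manual version in current C looks something like this sketch. The has_flag helper is an assumption added for this illustration.

```c
enum {
    FOO = 1 << 0, // 0x01
    BAR = 1 << 1, // 0x02
    ZYZ = 1 << 2, // 0x04
    XOR = 1 << 3, // 0x08
};

// Returns nonzero if the given flag is set in flags.
int has_flag (unsigned flags, unsigned flag) {
    return (flags & flag) != 0;
}
```

Inserting a new member in the middle means renumbering every shift below it by hand, which is where the mistakes tend to happen.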

This isn't an undeniable improvement because it's kind of a minor nitpick, such a minor nitpick that I could humor an argument that it's not worth the added keyword. That said, if enums support attributes, you could just use [[bitfield]]

Forced optimization settings

[[always_optimize]] static void very_costly_function () {
...
}
void main () {
int i = 0;
#optimize_start(3)
while (i < 100) {
do_stuff(i);
i ++;
}
#optimize_end
}

If you have some incredibly hot code path, you might want the compiler to always optimize it, even in debug builds. I have several of those.

If you added localized optimization settings, you might as well allow the opposite too: you might want a certain part of your code to be pure and not optimized.
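GCC has a function attribute that gets partway there. The following is a sketch that assumes GCC's optimize attribute; the ALWAYS_OPTIMIZE macro and the function bodies are made up for this illustration. The macro expands to nothing on other compilers, and GCC's own documentation describes the attribute as meant for debugging rather than production use.

```c
// GCC only: request -O3 for a single function regardless of build flags.
#if defined(__GNUC__) && !defined(__clang__)
#define ALWAYS_OPTIMIZE __attribute__((optimize("O3")))
#else
#define ALWAYS_OPTIMIZE
#endif

// Hypothetical hot function: sums the squares of 0..n-1.
ALWAYS_OPTIMIZE static long very_costly_function (int n) {
    long sum = 0;
    for (int i = 0; i < n; i ++) sum += (long)i * i;
    return sum;
}

long run_hot_path (int n) {
    return very_costly_function(n);
}
```

This works per function only, so it doesn't cover the block-level #optimize_start / #optimize_end idea.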

Manual function overloading

Function overloading means that 2 functions can have the same name as long as the arguments are different. I actually don't think function overloading should be an inherent part of C the way it is in C++, the fact that each function exists separately and can always be called through its true name has its strengths.

That said, I think manual overloading would be a great way to get all the benefits.

void foo_int (int x) {
...
}
void foo_float (float x) {
...
}
__overload__(foo, foo_int);
__overload__(foo, foo_float);

void main () {
int x = 123;
float y = 1.5;
foo(x); // Calls foo_int
foo(y); // Calls foo_float
}

I could see arguments against this ("you don't know what's really being called"), but most of them went out the window as soon as C added _Generic which enables a worse way to do exactly this. The following is currently valid C code:

void foo_int (int x) {
...
}
void foo_float (float x) {
...
}
#define foo(x) _Generic(x, \
float:foo_float, \
default:foo_int)(x)

void main () {
int x = 123;
float y = 1.5;
foo(x); // Calls foo_int
foo(y); // Calls foo_float
}

You could argue that _Generic is better because you can see all the overloads from the same location, but that's also what makes it worse and less useful than true function overloading. Firstly you can't just include a bunch of data storage libraries with their own overloads of append(), and secondly _Generic doesn't allow you to make overloads with varying number of arguments (well, technically you can, I invented a convoluted workaround macro for it).


Advanced improvements

These are contestable, start to steer away from C-like language style, and it's hard to say what complications may arise from them.

Enum value namespaces and inferring

If you use enums a lot, having to prefix all the values is really annoying because you'll almost immediately have to start using such long prefixes that they're longer than the actual member name.

typedef own_namespace enum {
BIG,
SMALL,
MEDIUM,
} FACE_SIZE;

void do_stuff (FACE_SIZE size) {
...
}

FACE_SIZE size = BIG; // Error: BIG not defined.
FACE_SIZE size = .BIG; // Valid, BIG is inferred from the type FACE_SIZE.
do_stuff(.SMALL);

This might have unforeseen consequences that I'm not seeing, mainly because this is similar to struct member initialization:

typedef struct {
FACE_SIZE a;
FACE_SIZE b;
} Thing;

Thing foo = {1, 2};
Thing foo = {.BIG, .SMALL};
Thing foo = {.a = .BIG};

Anonymous enum variables

struct Dude {
enum {
HAPPY,
ANGRY,
SAD,
} emotion;
};

Dude dude;
dude.emotion = .HAPPY;
dude.emotion = Dude.emotion.HAPPY; // Maybe this if the above won't work..?

Very often I want some simple state that is tied to a specific struct or function like this and doesn't need a separate enum.

This isn't an undeniable improvement because there might be syntax complications that I haven't considered. This probably also requires "Enum value namespaces and inferring" mentioned above.

#defines with correct scopes

When you #define something inside a function, it becomes available everywhere, not just inside that function. That's obviously incorrect behavior, I think the reason it happens is because the pre-processor isn't aware of functions or scopes. I don't know what's the solution to fixing it, all I know is that the current behavior is obviously incorrect and should be corrected.

#on_leave

void main () {
void* data = malloc(1000);
#on_leave{ free(data); }

if (x) return; // free(data) is inserted here.

for (int i=0; i<100; i++) {
Thing* thing = get_thing();
#on_leave{ release_thing(thing); }

if (x) continue; // release_thing(thing) is inserted here.
if (y) break; // release_thing(thing) is inserted here.
if (z) return; // release_thing(thing) and free(data) are inserted here.

// release_thing(thing) is inserted here.
}

// free(data) is inserted here.
}
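The closest existing thing is probably the cleanup variable attribute of GCC and Clang, which runs a function when the variable goes out of scope. A minimal sketch, where the free_data handler, the frees counter and the process function are assumptions for this illustration:

```c
#include <stdlib.h>

static int frees = 0; // Counts cleanups so the effect is observable.

// Cleanup handlers receive a pointer to the annotated variable.
static void free_data (void** data) {
    free(*data);
    frees ++;
}

int process (int x) {
    void* data __attribute__((cleanup(free_data))) = malloc(1000);
    if (!data) return -1; // free_data(&data) runs here too, free(NULL) is harmless.
    if (x) return 1;      // free_data(&data) runs here.
    return 0;             // And here.
}

int get_frees (void) {
    return frees;
}
```

Unlike #on_leave it is tied to one variable and needs a separate handler function for each kind of cleanup, and it is not standard C.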

#on_enter_function, #on_leave_function

This would make profiling and debugging related systems much easier to implement.

#on_enter_function{ printf("Hello! %s\n", __func__); }
#on_leave_function{ printf("Bye! %s\n", __func__); }

void testfunc (int x, int y) {
x += y * 2;
if (x > 1000) {
return;
}
printf("x=%i y=%i\n", x, y);
}

The code above would equate to the following:

void testfunc (int x, int y) {
printf("Hello! %s\n", "testfunc");
x += y * 2;
if (x > 1000) {
printf("Bye! %s\n", "testfunc");
return;
}
printf("x=%i y=%i\n", x, y);
printf("Bye! %s\n", "testfunc");
}

This would be especially powerful when combined with __function_id__ mentioned earlier in this page. You could automatically store information about function calls.

Ideally you would have some kind of tagging system that allows you to add arbitrary tags to functions, which can then be used to enable or disable these. I'm not sure exactly what that would look like though.

Metaprogramming through scripting

Imagine you could write a bit of C code that generates code. Something like this perhaps:

enum {
#comptime {
File file = read_file("things.txt");
while (1) {
char* word = read_word(file);
if (!word) break;
#paste_code(word)
#paste_code(",")
}
}
};

This code would read a file and write each word in that file as a member of the enum.

I can't do anything like this so I haven't had the opportunity to think of how exactly this could work. I think there's a couple newer languages that can do this, I haven't looked into how they do it though.
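For comparison, the nearest thing in current C is probably the "X-macro" trick, sketched below. It only works when the list lives in the source code itself; reading things.txt would still need a separate generator step in the build. All names here are assumptions for this illustration.

```c
// The single list that everything else is generated from.
#define THINGS(X) \
    X(APPLE)      \
    X(BANANA)     \
    X(CHERRY)

#define AS_ENUM(name) THING_##name,
#define AS_STRING(name) #name,

// Expands to THING_APPLE, THING_BANANA, THING_CHERRY, THING_COUNT.
enum Thing { THINGS(AS_ENUM) THING_COUNT };

// Expands to "APPLE", "BANANA", "CHERRY".
static const char* const thing_names[] = { THINGS(AS_STRING) };

// Returns the name of an enum value, or "?" if it is out of range.
const char* thing_name (enum Thing t) {
    return ((unsigned)t < THING_COUNT) ? thing_names[t] : "?";
}
```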


Breaking changes

I wouldn't recommend these for C because they break compatibility too much, but they would nonetheless make a better language. If you made a C clone language, you would do these.

Pointer is a property of the type, not of the variable

int* foo, bar; // Correct.
int *foo, *bar; // Stupid and wrong.

typedef syntax is rotated

// Before:
typedef unsigned int u32;
typedef struct {
int x;
} Foo;

// After:
typedef u32 unsigned int;
typedef Foo struct {
int x;
};

This way is more readable in my opinion, especially when it comes to structs.

This also allows you to universally search your codebase for all type definitions by searching for "typedef Name".

String/array length

Strings would be significantly easier to use if they had a length. However, there isn't an obvious "fix" because there's lots of ways to use strings. You'd just be trading an imperfect string system for another (although perhaps less) imperfect one.

The same applies to arrays, it's a bit silly that you currently need to get the array size separately. But even more than strings, you often define the length of an array with a #define constant, so the language adding a length on its own is completely useless, which may rub some people (me included) the wrong way. I don't want the language to do what I don't want it to do, and that's one of the reasons I like C, because it usually only does exactly the thing I tell it to do and nothing else.

On top of length vs no-length, there's also a question: what type should the length variable be? Should it be 32-bit? 64-bit? Should it be signed or unsigned? Depending on what you're programming (for example an HTTP server with 10 million simultaneous clients), a 64-bit integer for every string may use more space than you want to spend, perhaps you do math with it and benefit from a signed length value that won't wrap around when it goes negative, or perhaps you have your own storage mechanism that doesn't need it at all.

I almost never use C strings in my code, I typically do something like this:

typedef struct {
u64 length;
char* data;
} String;

#define string(s) (String){.length=strlen(s), .data=(s)}

String something = string("Hello world");

But I don't always need it, sometimes I just want the string to be a pointer to data.
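One small refinement is possible for string literals, sketched here: sizeof gives the length at compile time, so strlen is only needed for pointers. The string_lit name is an assumption for this illustration, and the "" s "" trick is there so that passing anything other than a literal fails to compile.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t u64;

typedef struct {
    u64 length;
    char* data;
} String;

// For string literals only: sizeof counts the terminating zero, hence the - 1.
#define string_lit(s) ((String){.length = sizeof("" s "") - 1, .data = ("" s "")})

// For pointers whose length is only known at run time.
#define string(s) ((String){.length = strlen(s), .data = (s)})
```

With this, string_lit("Hello world") has a length of 11 without any run-time work, but every literal still has to be wrapped in a macro.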

In the past I've had the idea to replace strings and arrays with a metaprogramming mechanism. The idea would be that when you create an array (strings being just arrays of u8's), the language creates aliases between the desired variables and converts/moves them around as necessary or creates them if they don't exist or uses literal values if they don't need to be stored at all. This gets a bit weird though, and has problems if for example you want to send the string as a pointer.

I think ultimately the best answer to the array length question would be to allow you to define new string/array types which define the structure. YOU as the programmer decide what the string or array format should look like. The main problem with my macro is that it's extremely annoying to wrap every single damn string into the string() macro. Maybe the type could be used as a prefix like this: String foo = String"Hello";. That would make it at least slightly easier, I might even rename the type from String to S. Or maybe prefix"" could be a special type of string macro, and prefix[] an array macro. I'm not sure.

It's also very difficult to do certain things with just a macro, for example what if I want to put the string length as a 16-bit integer into the beginning of the string? I've used those kinds of strings on several occasions, but I don't know how to do that at compile-time. With customizable string types the compiler could inject the length to the beginning of the string data automatically.
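One partial approach that does work at compile time, as a hedged sketch: declare a struct whose first member is the length and whose second member is a char array sized by the literal. The PREFIXED_STRING name is an assumption for this illustration. It has to be a declaration rather than an expression, and it relies on the compiler not inserting padding between the two members, which holds on common ABIs but is not promised by the standard.

```c
#include <stdint.h>

// Declares a constant object whose bytes are a 16-bit length
// followed directly by the characters (and the terminating zero).
#define PREFIXED_STRING(name, s) \
    static const struct { uint16_t length; char data[sizeof(s)]; } name = { sizeof(s) - 1, s }

uint16_t greeting_length (void) {
    PREFIXED_STRING(greeting, "Hello");
    return greeting.length; // 5, computed by the compiler.
}
```

It is still far from the "String foo = String"Hello";" kind of syntax, since every string needs its own named declaration.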

Compile-time programming could be another answer to how to implement this, although I'd prefer the compiler to just understand what I want and do it.

Switch cases break by default, and explicit fallthrough

switch (x) {
case 0: fallthrough;
case 1: printf("1\n"); // No need for break.
case 2: printf("2\n");
case 3: printf("3\n");
}

In 99% of cases you will want to break, so it should be the default behavior. Making break implicit would also free up the break keyword inside a switch, which would allow you to break from an enclosing loop when you're in a switch case.

Better variable arguments

The problem with C variable arguments is that you can't know what the arguments are from the function-side, you need to explicitly provide some kind of type information when you call the function. I don't have a good design for better varargs, but I assume that some kind of TypeID system would be involved.

You could add a new variable arguments system into current C without breaking compatibility simply by using a special keyword for the new system. I don't know if it's a good idea to have 2 though.

Different base type names

I don't really care since I can just typedef whatever I want in C currently, but I think the default type names are silly and you shouldn't use them if you were to remake C today. Here's what I would use as base types:

Most new languages seem to do something like this, and I think that's a good decision. The abstract names of C might have made more sense in the distant past when CPU register sizes were less consistent, but these days effectively all CPUs or GPUs are very consistent with these.

There may be some merit to having a type called "int" because sometimes you just want a number and don't really care what size it is. It should be an i64.

There's a few reasons for prefixing signed integers with i instead of s. Firstly, you rarely think of the words "signed integer", I think it's more common to think of just "integer" or "int" which starts with "i". Meanwhile when you think of an unsigned integer, you specify with the word "unsigned" which starts with "u". Secondly, the letter "s" also reminds of "string" and "struct", while "i" doesn't really remind of anything else. When I see "s32", it immediately makes me think of some kind of string. I might want to use a string that uses a 16-bit integer for its length, which is what "s16" sounds like.
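As a sketch of how such names can be had in current C with typedefs: the exact set below is an assumption extrapolated from the i64, u8 and u64 names used on this page, not a definitive list.

```c
#include <stdint.h>

// Signed integers.
typedef int8_t   i8;
typedef int16_t  i16;
typedef int32_t  i32;
typedef int64_t  i64;

// Unsigned integers.
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

// Floats. C doesn't promise these sizes, so check them at compile time.
typedef float    f32;
typedef double   f64;
_Static_assert(sizeof(f32) == 4, "float is expected to be 32-bit");
_Static_assert(sizeof(f64) == 8, "double is expected to be 64-bit");
```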

Enum members and struct members split by semicolon

enum {
FOO;
BAR;
ZYZ;
};

This is a very minor nitpick, but I typo these very often because everything else is defined like this. If I'm not consciously thinking about it, I'll end up typing semicolons here.

If you define multiple constants, you use semicolons. If you make a struct, the members are split by semicolons. If you make a bitfield with booleans, you use semicolons. But if you make an enum, you use commas... why?

I'd want the same for struct initialization:

Vec2f pos = {
.x = 10;
.y = 5;
};

Other than consistency, there's a big advantage in that if you decide to switch between that and the following (which I end up doing quite often), it's much easier to make the change:

Vec2f pos = {0};
pos.x = 10;
pos.y = 5;

Importing and building

There's nothing too interesting to say about this. It's obvious that building programs and including libraries and doing forward declaration in C is crap. Most other languages do it better, and the correct solution, at least to me, seems like it goes without saying.

#import "coolarray.c"

Coolarray test;
init_array(&test);

#import "coolarray.c" ca

ca.Coolarray test;
ca.init_array(&test);

When you import a file, the compiler pulls it in and finds all the types and functions from it, and it just works, and keeps that data around in case any other file imports it too.

Some parts of the C macro system stop working if you do this, but I can't think of any notable issues. The biggest confusing thing is that if you do a #define and then #import something, you can no longer #undef it from the outside, it will get "baked in" to that file and all the other files that it imports. That's probably a good thing though, it doesn't make sense for 2 files to import the same header with different modifications. It would be beneficial to change your thinking about #defines, or perhaps even have a new keyword like #global, and think of them as global project-wide settings.

Here's some other related details:

// Public functions are accessible by anyone who imports this file. This is default.
public void foo () {
...
}
// Private functions are unavailable even if you import this file.
private void bar () {
...
}
// Same as public, except this will also be visible from an object file or DLL.
export void zip () {
...
}
#import export "coolarray.c" // coolarray.c will be implicitly imported by any other files that import the current file.
#import global "coolarray.c" // coolarray.c will be implicitly imported by all files. Useful if you want some base libraries to always be imported in all project files.
#force_import "coolarray.c" // Forcibly imports everything, treating all private functions and variables as public.

For context, I always compile my programs as a single unit. I have my own way of not needing header files. This makes C programming a lot more comfortable for me, but it's still limited compared to the design above.


Remaining desirables

Even if all the changes above were applied, there's still neat things I'd like to have. However, these can't easily be added to C without significantly changing the language.

Function syntax

function add_ints (int x, int y) static -> int {
return x + y;
}

function* callback (int x, int y) -> int = add_ints;

function main () {
callback(1, 3);
}

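There are numerous benefits to doing something like this: every function declaration starts with the same keyword, the name comes before the return type, and a function pointer is declared almost exactly like a function.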

The biggest downside (besides totally breaking compatibility) is that it's longer to type, and it looks different from what you're used to. It really doesn't "look like C" at all.

Cleaner macros

In many cases I would prefer if macros could do type checking, be more easily contained, be able to return a value, and in general behave more like inline functions.

I don't have a good design or examples at the moment, but I feel like the entire macro system in C could be replaced with a more controllable one without losing any of its current capabilities. For example I think you could add optional type checking into the current macro system without breaking compatibility with current C code.

Adding enum members dynamically

enum ERROR {
NONE,
};

ERROR foo () {
if (x) return #unique_enum(ERROR, BAD_THING_HAPPENED);
if (y) return #shared_enum(ERROR, MEMORY_ALLOCATION_FAILED);
return ERROR.NONE;
}
ERROR bar () {
if (x) return #shared_enum(ERROR, MEMORY_ALLOCATION_FAILED);
return ERROR.NONE;
}
void main () {
ERROR error = foo();
if (error) {
printf("Oh no! %i\n", error);
}
}

I've always wanted something like this: global error numbers. An enum that you can expand dynamically. The value could flow all the way from some inner function to top-level code and retain its meaning. I think I would have to build an actual programming language before I can say anything more about it.

With some additional introspection metaprogramming features (for example enum_name_as_string(ERROR,value)), you could even print the error name regardless of where it comes from.

This could also be useful for other purposes, for example #including a UI module which adds a new UI node type to a UI system.

Struct templates

If you've ever implemented custom arrays or buffers in C, or perhaps certain kinds of math, you may know what I mean.

struct Array {
u32 count;
Something* data;
};

The Something here should be whatever's in the array. There's no good way to re-define this with a different pointer type without tedious macro crap. But here's what it could look like:

struct Array(T) {
u32 count;
T* data;
};

void print_ints (Array(int)* array) {
for (u32 i=0; i<array.count; i++) {
printf("%i = %i\n", i, array.data[i]);
}
}

Array(int) array_of_ints;

At first this may seem icky and un-C-like, but if you really think about it, this isn't much different than the hideous macro hacks that people tend to present when they want to do something unusual. All this does is expand to a different struct based on what the input is, and you can't define a function that takes in Array(T), it has to have a type attached to it.

An interesting note is that you could do this without any additional features if, as mentioned earlier in this page, identical anonymous structs were compatible. Then you could just do something like this:

#define Array(T) struct{ u32 count; T* data; }

Array(int) array_of_ints;
// The line above will expand to the line below:
struct{ u32 count; int* data; } array_of_ints;

Since anonymous structs aren't compatible in current C, you cannot pass this to any function. And even if they were, I guess this might slow down compile times, since the compiler would have to interpret and compare tons of anonymous structs everywhere.

Struct templates would also be useful for other purposes. For example I use generic vector structs for all kinds of positions and offsets, I need those structs in basically all base types (for whatever packing reasons). But instead of defining Vec2_u16 along with a thousand other things with macros, I would prefer to use Vec2(u16) and have it interpreted automatically.

The utility of struct templates is limited by the fact that functions cannot take in an arbitrary struct template. For example you cannot create a generic multiply() function that can multiply any type of vector. For that you would need function templates, which completely destroy the idea of remaining C-like. You CAN use struct templates for arrays though (because all pointers are technically interchangeable); see the Function templates section below for an example.

Function templates

This is a feature that would make my life a LOT easier, but also the feature that is hardest to justify for being added to C.

Depending on how ugly you're willing to go, you can work your way to some template-ish stuff even with current C. Here's one of my own array types that I use:

#define DEFINE_TMFARR(name, T) \
typedef struct { u32 count; u32 limit; T* data; } name;

DEFINE_TMFARR(Tmfarr, void)

void _tmfarr_append (Tmfarr* a, u32 itemsize, void* item) {
...
}
#define tmfarr_append(a, item) \
_tmfarr_append((Tmfarr*)(a), sizeof(*(a)->data), (item))
DEFINE_TMFARR(Tmfarr_int, int)

void main () {
Tmfarr_int ints = tmfarr_new(Tmfarr_int, 64);
tmfarr_append(&ints, &(int){123});
for (u32 i=0; i<ints.count; i++) {
printf("%i = %i\n", i, ints.data[i]);
}
}

The idea is to have a base array type that uses a void pointer, and then define new arrays for other types. When you use one of the array functions, the array gets cast to the base type, and the size of the element type is extracted from the data pointer and sent to the function.

The only reason this works is that pointers are interchangeable; the same trick won't work for vector templates.

There are a lot of questions about how exactly template functions should work. For example in C++ you sometimes have to add the type separately after the function name when you call the function, as in array_append<int>(...);, whenever the compiler can't deduce it from the arguments. Perhaps this is necessary for some reasons, but when you're programming, you just want to use the array and function as if they're regular structs and functions.

Better error messages

This isn't really a C problem, this is a compiler problem. I use GCC and the error messages are horribly bad.


Making my own language?

I've tried to make a language on a few occasions. The conclusion I reached is that in order for my own language to be an improvement over C, I need to make a really proper language and not half-ass it. Problem is that making a proper language requires way more work than I'm interested in putting into such a project.