TFD
TFD is my vision for a programming language. It is basically an overhauled version of C, meant to be cleaner and more comfortable to use, mostly by being less pedantic and allowing you to do the same kind of things with less effort. In this page I mostly describe it from the perspective of how it differs from C. This page also functions as documentation for myself.
C is the programming language that is closest to my ideal, but there are a lot of things about it, big and small, that are very clumsy or annoying. Just to give an example, when you define the value of a struct for a variable or function call, you have to type (Somestruct){x,y} even though there isn't any good reason why {x,y} wouldn't be enough.
Some of the problems, such as the one above, are fixed by C++. However, almost all of the problems I have with C also exist in C++, and it introduces some new ones (for example you can't type {.y=y,.x=x}, it has to be in the correct order {.x=x,.y=y}). It also just feels bad to use a language that is so overcomplicated. It makes me feel like I'm on unstable ground and that the survival and propagation of the language (and thus my codebase) is complex and uncertain.
There are other languages that claim to fix C or be a better version of it, but all of them miss the point of what I actually want and care about.
One of the core philosophies of TFD is that it's not the programming language's place to tell the programmer what's the right way to program. That's why you can configure most behaviors and rules with compiler settings, and it doesn't withhold features or impose restrictions for ideological reasons.
I've gone through multiple phases of wanting to make different kinds of languages, or a simple pre-processor for C, but I always end up feeling like the costs outweigh the advantages. I am trying to implement some features with SBC in a way that doesn't require a proper language.
I may make TFD some day when I find the right motivation and time. The biggest problem is that I don't really want to learn to use LLVM, and the only alternative is to transpile into C code, but that comes with its own complications. I'm also considering learning enough x86 that I could just output an executable directly, but that couldn't be optimized at all.
Here's some random sample code:
#module "basedefs.tfd";
#module "print.tfd";
#module "memory.tfd" mem;
struct Vec2f {
f32 x;
f32 y;
};
struct Entity {
enum STATE u8 {
NONE;
ALIVE;
DEAD;
INVINCIBLE;
};
STATE state;
STATE state_previous;
Vec2f pos #inherit;
};
function create_entity (Entity.STATE state, Vec2f pos) Entity {
return {
.pos = pos,
.state = state,
.state_previous = .NONE,
};
}
function main (int arg_count, &String args) int {
#import "fireworks.tfd";
// &Entity enemies = mem.alloc(32*32*#sizeof(Entity));
&[32][32]Entity enemies = mem.alloc(#sizeof([32][32]Entity));
int fireworks_done = 0;
function do_fireworks (&Entity entity) ERRNUM {
ERRNUM e = spawn_fireworks(entity.x, entity.y);
if (!e) fireworks_done ++;
return e;
}
for (int y=0; y<32; y++) {
for (int x=0; x<32; x++) {
Entity enemy = create_entity(.INVINCIBLE, {(*)x, (*)y});
ERRNUM e = do_fireworks(&enemy);
if (e) {
break 2; // Break both loops.
}
// enemies[y*32+x] = enemy;
enemies.[y][x] = enemy;
}
}
printf("Did {} fireworks!\n", fireworks_done);
return 0;
}
NOTE: ALL keywords are prefixed with # by default; basedefs.tfd defines un-prefixed names for the most common keywords. All the code below on this page will use the raw syntax, but in reality you're expected to use basedefs.tfd (or base.tfd which includes a bunch of other basics) or define names according to your preference.
Base types
u8, u16, u32, u64 // Unsigned integers. The number is the size in bits.
i8, i16, i32, i64 // Signed integers.
f32, f64 // Floating point types.
void // No type.
bool // #true (1) or #false (0). Unsigned integer with the smallest directly addressable size (i.e. you must be able to point a pointer to it), effectively always u8. This is technically an enum with strict type checking.
int // Integer that matches the biggest natural register size, almost always i64. This can also communicate that you don't have a reason for a specific size.
uint // Unsigned version of the above.
Notable syntax differences from C
Examples of syntax in C, followed by the equivalent in TFD.
- There are no strings by default, see strings for how strings work in TFD.
- Arrays are not equivalent so they cannot be directly compared, more about them below.
- Pointers are dereferenced with * like in C, however it goes immediately before the member you're dereferencing, and has maximum precedence over all other symbols. The . operator will dereference once if the variable on the left is a pointer.
- Think of & as "address of".
// Pointer to int.
int *thing = NULL;
&int thing = #null;
// Pointer to struct
thing->x = 123;
thing.x = 123;
// Pointer to pointer to struct
(*thing)->x = 123;
*thing.x = 123;
// Struct member pointer
*thing.x = 123;
thing.*x = 123;
// Struct member pointer to struct member pointer
*(*thing.x).y = 123; *thing.x->y = 123;
thing.x.*y = 123;
// Typedef.
typedef int Something;
#typedef Something = int;
// Struct.
typedef struct { xxx; } Foo;
#struct Foo { xxx; };
// Union.
typedef union { xxx; } Foo;
#struct Foo #overlap { xxx; };
// Function.
static int foo () { xxx; }
#function foo () int { xxx; }
// Function pointer.
int (*foo) () = NULL;
&#function foo () int = #null;
// Import from pre-defined directories.
#include <foo.h>
#module "foo.tfd";
// Import from local path.
#include "foo.h"
#import "foo.tfd";
// Switch.
switch (foo) {
case 1:
case 2: break;
default: break;
}
#if (foo) ... {
#case 1; #next_case;
#case 2;
#case;
}
Value literals
int value = 1222333444555666777; // No need to postfix this kind of number with "LL".
int value = 0xFFAABB; // Hex value.
int value = 0b0000111100001111; // Bit value.
// All number types will completely ignore underscores (except inside the 0x or 0b prefixes). Can be used at your discretion to make the number more readable.
int value = 1_222_333_444_555_666_777;
int value = 0x_FF_AA_BB;
int value = 0b_00001111_00001111;
Character literals.
u32 value = 'X'; // 0x58
u32 value = 'Help'; // 0x706C6548
u32 value = '❤'; // 0xA49DE2
The size of the character literal must be equal to or smaller than the type. 'Hello' would give an error here because u32 is only 4 bytes. If the type is larger than the value, 0s are added to the end. The data is in text byte order, basically the equivalent of this in C:
u32 value = *(u32*)"X\0\0\0";
u32 value = *(u32*)"Help";
u32 value = *(u32*)"❤\0";
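The pointer casts above are only a mental model. As a rough sketch of the same packing in standard C, assuming the little-endian layout that the hex values in the comments imply, the bytes can be shifted into place one by one. The helper name pack_chars is made up for this illustration and is not part of TFD.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: packs up to 4 bytes of text into a u32 in text byte
   order (first character in the lowest byte), padding the rest with zeros. */
uint32_t pack_chars(const char *text)
{
    size_t len = strlen(text);
    assert(len <= 4); /* 'Hello' would not fit into a u32 */

    uint32_t value = 0;
    for (size_t i = 0; i < len; i++) {
        value |= (uint32_t)(unsigned char)text[i] << (8 * i);
    }
    return value;
}
```

With this, pack_chars("Help") gives 0x706C6548, and the three UTF-8 bytes of the heart symbol (E2 9D A4) give 0xA49DE2, matching the comments above.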
Pointer offsets
&u32 foo;
foo ++; // Moves the pointer by 4 (#sizeof(u32)) bytes.
foo &++; // Moves the pointer by 1 byte.
foo[2] = 500; // Modify a value from offset #sizeof(u32)*2.
&u32 bar = foo + 2; // Gets an 8-byte (#sizeof(u32)*2) offset to foo.
&u32 gar = foo &+ 2; // Gets a 2-byte offset to foo.
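For comparison, here is roughly how the byte-wise &+ offset has to be spelled in C today, by going through a byte pointer. This is only a sketch; byte_offset and byte_distance are hypothetical helper names, and the result is returned as void* because a byte offset that is not a multiple of 4 would not be a valid uint32_t pointer in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rough C counterpart of `ptr &+ n`: moves by n bytes, not by n items. */
void *byte_offset(void *ptr, ptrdiff_t bytes)
{
    assert(ptr != NULL);
    return (unsigned char *)ptr + bytes;
}

/* Number of bytes between two addresses inside the same object. */
ptrdiff_t byte_distance(const void *from, const void *to)
{
    return (const unsigned char *)to - (const unsigned char *)from;
}
```

Given uint32_t foo[4], byte_distance(foo, foo + 2) is 8, while byte_distance(foo, byte_offset(foo, 2)) is 2.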
Arrays
Arrays are treated the same way as structs, they are passed and copied by value. In C arrays are treated as a weird fake pointer.
foo.[x] will access a member of the array, foo[x] is an offset pointer dereference (same as in C). If accessing a member with a simple number offset, you can omit the brackets and just do foo.x
[4]int a;
a.1 = 123;
printf("Size in bytes is {}, it has {} ints\n", #sizeof(a), #countof(a));
#function test2 ([4]int x) [4]int {
#return x;
}
a = test2(a);
// The above code is identical to this:
#struct Arr4 {
int item0;
int item1;
int item2;
int item3;
};
Arr4 s;
s.item1 = 123;
printf("Size in bytes is {}, it has {} ints\n", #sizeof(s), #sizeof(s)/#sizeof(s.item0));
#function test1 (Arr4 x) Arr4 {
#return x;
}
s = test1(s);
It's important to internalize the difference to C arrays because offsetting a pointer to an array will offset the pointer by the whole array size, not the item size.
[4]int array;
&[4]int a = &array;
a[1]; // a &+ #sizeof(int)*4, this will overflow the array
a ++; // a &+= #sizeof(int)*4
a.[1]; // a &+ #sizeof(int), access second item in the array. Like with structs, period will implicitly dereference once if needed.
An array's pointer type is (by default) compatible with the item's pointer type.
#function print_floats (int count, &f32 items) {
#for (int i=0; i<count; i++) {
printf("{} = {}\n", i, items[i]);
}
}
[100]f32 array;
print_floats(#countof(array), &array); // Even though the function wants float pointer, a float array pointer will be accepted too.
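The whole-array stride is not unique to TFD. C's explicit pointer-to-array type, int (*)[4], already behaves this way; C just hides it because plain array names decay to item pointers. A small sketch that measures both strides (the function names are made up for this example):

```c
#include <assert.h>
#include <stddef.h>

/* Bytes covered by advancing a pointer-to-array by one. */
size_t array_pointer_stride(void)
{
    int array[4] = {0};
    int (*a)[4] = &array; /* comparable to: &[4]int a = &array; */
    unsigned char *start = (unsigned char *)a;
    unsigned char *next = (unsigned char *)(a + 1); /* one past the whole array */
    assert(next > start);
    return (size_t)(next - start);
}

/* Bytes covered by advancing a pointer-to-item by one. */
size_t item_pointer_stride(void)
{
    int array[4] = {0};
    int *p = array; /* the array name decays to a pointer to its first item */
    return (size_t)((unsigned char *)(p + 1) - (unsigned char *)p);
}
```

The first function returns 4 * sizeof(int) and the second returns sizeof(int), which is the same distinction as a[1] versus a.[1] above.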
Zero initialization
int foo; // Initialized to 0.
int foo #no_init; // Uninitialized.
Multiple switch case values
#if (foo) ... {
#case 1, 2, 3; print("1, 2, or 3!\n");
#case 4 ... 9; print("4, 9, or somewhere inbetween!\n");
#case; print("unknown...\n");
}
Nested functions
#function check_adjacent (int x, int y) {
int count = 0;
#function check (int x, int y) {
...
count ++;
}
check(x, y-1);
check(x, y+1);
check(x-1, y);
check(x+1, y);
}
#function load_assets () {
#function callback (String file_path, bool is_folder) {
...
}
read_folder_contents("/assets/images/", callback);
// You can shove it directly into function arguments, this is identical to the above except the function doesn't have a name.
read_folder_contents("/assets/things/", #function (String file_path, bool is_folder) {
...
});
}
There might be restrictions in some cases, see Additional thoughts.
Type definitions
#typedef is used to create duplicate types and aliases. #struct and #enum create a type automatically if they're followed by a name.
The biggest reason (besides self-documenting code) to use typedef is to define type-checking rules:
#typedef Itemid #strict = u32;
#typedef Entityid #relaxed = u32;
- #strict types are not compatible with anything other than themselves. This is the default for named enums.
- #abitstrict types are compatible with relaxed types, but not with other abitstrict types. This is the default for primitive typedefs.
- #relaxed types are compatible with everything except strict types. Regular integer/float values are relaxed types. Good if you want your code to be self-documenting, but don't want the compiler to be picky about your ints or whatever.
You can change the default strictnesses with build rules. Integers/floats have additional rules outside of this categorization, by default you can't set a value if there may be loss of information (float -> int, u32 -> u16, signed -> unsigned or vice versa).
#compatible_types can be used to override strictness rules and make types compatible with each other. Only works for types with the same size and structure.
#struct Vec2i {
int x;
int y;
};
#struct Location {
int horiz;
int vert;
};
#struct Dimensions {
int width;
int height;
};
#compatible_types Vec2i, Location, Dimensions, [2]int;
#function test (Location pos) { ... }
Vec2i vector;
Dimensions size;
[2]int array;
test(vector + size);
test(array);
Enums
#enum COLOR u8 {
RED;
GREEN;
BLUE;
};
#enum COOL_BITS #bitfield {
FOO; // 0x01
BAR; // 0x02
ZYZ; // 0x04
XUL; // 0x08
};
#function paint_the_wall (COLOR color) {
...
}
paint_the_wall(COLOR.BLUE);
paint_the_wall(.BLUE); // Same as above.
COOL_BITS bites = .FOO | .XUL;
If #enum is immediately followed by a name, it creates a new type. If there's a name at the end, it creates an enum variable.
#enum { RED; GREEN; BLUE; } color = .GREEN;
#if (color == .RED) {
color = .BLUE;
}
The name is entirely optional. Without one, this basically creates a bunch of constant integer values.
#enum { RED; GREEN; BLUE; };
int color = RED;
Switch statements have a special modifier that requires every enum value to have a condition.
#if (color) #complete_enum ... {
#case .RED; print("Rad!");
#case .BLUE; print("Bleu!");
}
// Error: a case for .GREEN is missing from switch.
There are some compile-time constants for enums:
- #highest_enum_member(COLOR) = Expands to the member of the enum with the highest value.
- #enum_member_names(COLOR,String) = Expands to an array of strings (with the specified string macro) containing all the names, mapped to their equivalent integer values. This will give an error if the array exceeds a maximum configured size.
- #all_bits(COOL_BITS) = Expands to a value with all the bits of all the values merged together. Meant for bitfields, but works on normal enums too.
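To show what #bitfield and #all_bits save you from, here is roughly how the COOL_BITS enum has to be written by hand in C. This is a sketch for comparison only; the COOL_ prefix, COOL_ALL_BITS and has_all are names I made up here, and C gives you no way to generate the combined value automatically.

```c
#include <assert.h>

enum cool_bits {
    COOL_FOO = 1 << 0, /* 0x01 */
    COOL_BAR = 1 << 1, /* 0x02 */
    COOL_ZYZ = 1 << 2, /* 0x04 */
    COOL_XUL = 1 << 3, /* 0x08 */
    /* Hand-maintained stand-in for #all_bits(COOL_BITS). */
    COOL_ALL_BITS = COOL_FOO | COOL_BAR | COOL_ZYZ | COOL_XUL
};

/* Returns nonzero if every bit in `wanted` is set in `bits`. */
int has_all(unsigned bits, unsigned wanted)
{
    assert((wanted & ~(unsigned)COOL_ALL_BITS) == 0); /* only known bits */
    return (bits & wanted) == wanted;
}
```

Every new member has to be added to COOL_ALL_BITS manually, which is exactly the kind of bookkeeping the compile-time constant is meant to remove.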
Structs
#struct Vec2f {
f32 x;
f32 y;
};
#struct Thing #pack(1) { // Members are tightly packed.
u8 foo;
f32 bar;
Vec2f pos;
};
Like with enums, you can create struct variables.
#struct { int x; int y; } thingy = {5, 20};
#if (thingy.x > 100) {
thingy.x -= 100;
}
Structs don't need to have a name at all, they can be used to describe grouping or get more precise control over padding/offsets of multiple variables.
#struct { int foo; int bar; } = {5, 20};
foo = 2000;
There are no "unions". To get the equivalent of a C union, you must manipulate the offsets of struct members. Here are some examples:
#struct Coolthing {
f32 foo #offset(0);
i64 bar #offset(0);
};
#struct Coolthing #overlap { // Easy way to give all members an offset of 0.
f32 foo;
i64 bar;
};
#struct Coolthing {
f32 foo;
#struct #overlap {
f32 bar;
i32 zorg;
};
};
#struct Coolthing { // Alternate way to define the same as above.
f32 foo;
f32 bar;
i32 zorg #offset(.bar);
};
#struct Splitnumber {
u64 full;
u32 lower #offset(.full);
u32 upper #offset(.full+4);
};
A struct can be padded to a specific size. For example you could make a struct whose size gets rounded up to the next multiple of 32 bytes, making it easier to use 256-bit AVX on it:
#struct Simdthing #align_size(256/8) {
u8 x;
f32 y;
};
Inherited struct members:
#struct Vec2f {
f32 x;
f32 y;
};
#struct Tree {
int leaf_count;
Vec2f position #inherit;
};
Tree tree;
tree.position.x = 123;
tree.x = 123; // Same as above, .x is inherited from Vec2f position.
To be clear, since "inheritance" is a thing in object oriented programming: this isn't that. This has no effect on the data or struct or behavior, it only enables alternate syntax for accessing the child members.
You could also think of an inherited member this way:
#struct Vec2f {
f32 x;
f32 y;
};
#struct Tree {
int leaf_count;
Vec2f position;
f32 x #offset(.position.x);
f32 y #offset(.position.y);
};
Structs cannot have "private" members.
Struct members can have default values. Whenever you create a variable with the struct, the members are secretly assigned the default values.
#struct Thing {
int x = 14;
int y = 500;
};
Thing foo;
printf("{},{}\n", foo.x, foo.y); // "14,500"
Default values can be particularly helpful when defining struct variables:
// Without default values (C-style):
#struct {
int x;
int y;
} foo = {
.x = 14,
.y = 500,
};
// With default values:
#struct {
int x = 14;
int y = 500;
} foo;
Braces will be interpreted as the relevant struct/array type based on context, in C you would have to put a type cast before them.
#struct Vec2f {
f32 x;
f32 y;
};
#function test_vecs (Vec2f a, Vec2f b) { ... }
#function test_array ([4]int a) { ... }
Vec2f v;
v = {5,22};
test_vecs(v, {1,2});
test_array({1, 2, 4, 5});
Miscellaneous examples of variable grouping with structs:
#function get () #struct {int x; int y} {
#return {1, 2};
}
#function give (#struct {int x; int y} foo) {
printf("Gave {} and {}!\n", foo.x, foo.y);
}
#function main () {
#struct {int x; int y} foo = get();
printf("{},{}\n", foo.x, foo.y);
give(foo);
give({.x = 123});
#struct { int x; int y; } = get();
printf("{},{}\n", x, y);
y = get().y;
}
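For reference, this is roughly what the Splitnumber example from the #offset part above looks like as a C11 union with an anonymous struct. It's only a comparison sketch, and make_split is a hypothetical helper. Which half of full ends up in lower depends on byte order; the TFD names assume little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C11 approximation of Splitnumber: `full` overlaps the two halves. */
typedef union {
    uint64_t full;
    struct {
        uint32_t lower; /* offset 0, like #offset(.full)   */
        uint32_t upper; /* offset 4, like #offset(.full+4) */
    };
} Splitnumber;

/* Fills in both halves; `full` then aliases the same 8 bytes. */
Splitnumber make_split(uint32_t lower, uint32_t upper)
{
    assert(sizeof(Splitnumber) == sizeof(uint64_t));
    Splitnumber s;
    s.lower = lower;
    s.upper = upper;
    return s;
}
```

The C version needs an extra nesting level to keep lower and upper from overlapping each other, whereas the #offset form states each member's position directly.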
Nested types
Types can be nested inside structs. A nested type doesn't do anything by itself, it's just a regular type with a special namespace. You can nest types just for the heck of it, but the intention is to use it for the members of the parent struct.
#struct Vec2f {
f32 x;
f32 y;
};
#struct Entity {
#enum STATE u8 {
NONE;
ALIVE;
DEAD;
INVINCIBLE;
};
STATE state;
STATE previous_state;
Vec2f pos;
};
// Nested types can be used from the outside like this.
Entity.STATE state = .ALIVE;
state = .DEAD;
// Surprise: ALL members can be used as types, not just nested types.
Entity.pos position = {15, 20};
Vec2f position = {15, 20}; // Same as above.
Struct templates
#struct Array(T) {
u64 count;
&T data;
};
#function print_ints (&Array(int) a) {
#for (u64 i=0; i<a.count; i++) {
printf("{} = {}\n", i, a.data[i]);
}
}
#function main () {
Array(int) a;
print_ints(&a);
}
Note: this is only a convenient way to define alternate versions of a struct, there are no template functions.
Operators on structs and arrays
#struct Vec2f {
f32 x;
f32 y;
};
Vec2f foo;
Vec2f bar;
#if (foo == bar) { // Compare each member of the structs.
...
}
Vec2f zip = {
.x = foo.x+bar.x,
.y = foo.y+bar.y,
};
Vec2f zip = foo + bar; // Same as above, just adds each member.
Vec2f zip = {
.x = foo.x*100,
.y = foo.y*100,
};
Vec2f lol = foo * 100; // Same as above, just multiplies each member.
[2]f32 foo;
[2]f32 bar;
#if (foo == bar) { // Compare each member in the array.
...
}
[2]f32 zip = {
.0 = foo.0+bar.0,
.1 = foo.1+bar.1,
};
// Same as above, just adds each member.
[2]f32 zip = foo + bar;
// You could also think of it as a loop.
#for (int i=0; i<#countof(zip); i++) {
zip.[i] = foo.[i] + bar.[i];
}
Operators can be used across types if they are made compatible with #compatible_types and have the same member types/offsets.
#struct Vec2i {
int x;
int y;
};
#struct Dimensions {
int width;
int height;
};
#compatible_types Vec2i, Dimensions, [2]int;
[2]int array;
Dimensions size;
Vec2i pos = array + size;
#if (array == pos) {
...
}
Multi-break and continue
#for (...) {
#for (...) {
#if (y == 10) #continue; // Continues the inner loop.
#if (x == 10) #continue 2; // Breaks the inner loop and continues the outer loop.
#if (x+y == 1000) #break 2; // Breaks both loops.
}
}
Break works on scopes:
{
{
#if (y == 10) #break #scope; // Basically goto to the end of the current scope.
#if (x == 10) #break #scope 2; // Same except the outer scope.
}
}
By putting a label before a loop/scope, you can use break on it.
:outer: #for (...) {
:inner: #for (...) {
#if (y == 10) #break inner;
#if (x == 10) #break outer;
}
}
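For contrast, C has no multi-level break, so the usual workaround is a goto to a label placed after the outer loop. Here's a minimal sketch of that pattern; the function name and the 4x4 grid are arbitrary choices for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Scans a 4x4 grid row by row and returns how many cells were visited
   before `target` was found (or 16 if it never was). */
int cells_visited_until(int grid[4][4], int target)
{
    assert(grid != NULL);
    int visited = 0;
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            visited++;
            if (grid[y][x] == target) {
                goto done; /* stands in for: #break 2; */
            }
        }
    }
done:
    return visited;
}
```

The label has to sit outside both loops and be named by hand, which is the part that #break 2 and labeled loops make unnecessary.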
Strings
TFD has no strings by default. In order to use a string, you must first define what kind of string you want. The language comes with a library #module "string.tfd"; which defines a string type, all standard libraries use it too.
String constants are created with a "string macro" system. If the macro is not explicitly used, it is automatically picked based on the context.
String foo = "Hello world"; // Struct with length and data pointer.
Cstring bar = "Hello world"; // Pointer to u8, a 0 is appended to the end of data.
// The string literals above are actually using string macros. The macro is inferred based on the type of the variable, but you can use the macro explicitly:
String foo = Stringxxx"Hello world";
Cstring bar = Cstringxxx"Hello world";
These two string types look something like this:
#struct String {
i64 length;
&u8 data;
};
#typedef Cstring #strict = &u8;
The macros are created like this (note that these work somewhat differently than regular #macros):
#macro Stringxxx"" \
#inferred_type String \
#place {.length=#characters,.data=#data}
#macro Cstringxxx"" \
#inferred_type Cstring \
#place #data \
#append_invisibly u8 0
- #inferred_type = the type that causes this macro to be inferred. If this is not defined, then the macro must always be used explicitly. 2 string macros cannot use the same inferred_type, but by typedefing a new one like Cstring here, it won't conflict with a &u8 string macro.
- #place = the string constant will be replaced with this.
- #append, #prepend = Append or prepend something to the string data.
- #append_invisibly, #prepend_invisibly = Same as above, except these do not increase the value of #bytes or #characters.
- #align_size = Works the same way as with structs, this will pad the data with 0s until it aligns to the desired size.
- #bytes, #characters, #data = integer for the number of bytes, integer for the number of characters, and pointer to the string data. The data is UTF-8 by default but can be inserted in a few different formats, for example #data_utf16. #data_ascii can be used to enforce ascii-only content.
Note 1: I added xxx to the macros for clarity, in reality they would use the same name as the type.
Note 2: This syntax for defining a string macro is just the first idea I came up with. Sometimes the simplest answer is the best, but there's probably a better way to do it.
Here strings / multiline string literals / custom string delimiters
2 backticks `` can be used in place of normal quotations " to create a multi-line string. Helpful if you want to include a lot of text, like help texts or GPU shaders or something with a lot of normal quotation marks.
Indentation is ignored up to the same level as the line that the ending backticks are on. 1 empty line immediately before and after the backticks is also ignored.
String sometext = ``
Hello world,
this is "a story" about
coding and stuff!
``;
// Defining the string above in C would be done like this:
char* sometext =
"Hello world,\n"
"this is \"a story\" about\n"
"coding and stuff!";
The backticks actually create a special delimiter, you can optionally add a word inside of them:
String sometext = `STR`This is a cool `piece of text` with some ``backticks`` all over it, they won't end the string because of the custom delimiter word.`STR`;
Explicit string macros work the same way as with normal strings:
Cstring sometext = Cstringxxx`STR`This is a "cool" piece of text..!`STR`;
Built-in constant values
- #line = Integer for the current line number.
- #file_name = String for the current file's name.
- #file_path = String for the current file's relative path (not including name).
- #file_full_path = Same as #file_path except the full system path.
- #func_id = Integer that is unique to each function. This is 1 in the first function, 2 in the second function, and so on. There's no particular order, the numbers are likely in the order that the compiler finds the functions in.
- #func_count = Integer for the total number of functions in the program.
- #func_name = String for the name of the function.
- #func_name_array = Array of Strings containing every function name, indexed according to the function's #func_id.
- #unique(foo) = Every time this is used in source code, it resolves into a different integer value for the given key. It starts from 0 and increments each time it's used. This is similar to __COUNTER__ in C except you have multiple counters by using different keys. To be clear, this is compile-time, the values do not change at runtime.
Functions may be removed during compilation (e.g. non-exported functions that aren't used anywhere). Removed functions will have empty names in #func_name_array, or may be removed entirely (in which case they also don't contribute to #func_id or #func_count).
Default function arguments
#function foo (int a, int b = 100, int c = 1337) int {
#return a * b * c;
}
// All of the following function calls are the same.
f32 hi = foo(5, 100, 1337);
f32 hi = foo(5, 100);
f32 hi = foo(5);
f32 hi = foo(5,, 1337);
Type functions
These are just normal functions with special syntax for accessing them. The first argument's type can be #base_type (or a pointer to it), which will be treated in a special way.
#function f32.add (&#base_type a, f32 b) {
*a += b;
}
f32 something = 3.14159;
f32.add(&something, 100); // Call the function manually.
something.add(100); // Same as above, the actual purpose of a type function is to use it like this.
Function overloading
Functions must have a unique name, but they can be overloaded separately. There cannot be a function with the same name as the overloaded name.
#function foo_int (int x) {
...
}
#function foo_float (f32 x) {
...
}
#overload foo foo_int;
#overload foo foo_float;
#function main () {
int x = 123;
f32 y = 1.5;
foo(x); // Calls foo_int
foo(y); // Calls foo_float
}
This is similar to using a _Generic macro in C, the main difference is that with #overload you can define each overload separately.
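For reference, here's a minimal sketch of that C11 _Generic approach. The two functions return a tag number purely so the dispatch can be checked; that's my addition for the example, not something the TFD version does.

```c
#include <assert.h> /* only needed if you check the results with assert() */

int foo_int(int x)     { (void)x; return 1; } /* tag 1: int version   */
int foo_float(float x) { (void)x; return 2; } /* tag 2: float version */

/* Every overload has to be listed in this one macro. */
#define foo(x) _Generic((x), int: foo_int, float: foo_float)(x)
```

With int x = 123 and float y = 1.5f, foo(x) calls foo_int and foo(y) calls foo_float. The downside is visible in the macro: adding an overload means editing the single _Generic list, whereas #overload lines can live next to each function.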
Bounds checking
When you access pointers or arrays, the bounds will be checked at compile-time and you get an error if you overflow the length. If the index and/or length aren't known at compile-time, a bounds check is done at runtime to make sure the index is within bounds. Runtime bounds checks can be disabled with compilation options.
- #index = can be used in the following conditions, this is substituted with the index that you access the array/pointer with.
- #bounds_check(condition) = modifier for pointers, the condition must pass or else you get an error. Can use any integer variables and constants, for example #bounds_check(#index <= foo*bar-8).
- #counted_by(x) = equivalent to #bounds_check(#index<x). This is just for convenience since this is the most common use-case.
- #always_check_bounds, #never_check_bounds = modifier for struct types and variables, can be used to override the compiler setting for whether to check bounds.
#struct Coolarray {
int count;
&int data #counted_by(.count);
};
Coolarray a;
a.data = mem.alloc(10*#sizeof(int));
a.count = 10;
a.data[10] = 123; // Compile-time error: index 10 is out of bounds.
#function test_me (int count, &int data #counted_by(count)) {
data[10] = 123; // Runtime error: index 10 is out of bounds.
}
test_me(a.count, a.data);
Many cases of bounds checking would be done at compile-time since the limit is resolvable at compile-time. If the check can be done at compile-time, it won't be done at runtime.
Arrays automatically have #counted_by(#countof(a)).
Bounds checking is enabled by default, but can be toggled on/off whenever, in a single file, in a single function, in a single scope...
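As a rough picture of what a runtime #counted_by check amounts to, here is the same idea written out by hand in C. This is a sketch under my own assumptions: coolarray_set is a made-up helper, and returning false stands in for whatever error TFD would actually raise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t count;
    int *data; /* conceptually: &int data #counted_by(.count) */
} Coolarray;

/* Writes `value` at `index` only if the index passes the count check.
   Returns false instead of writing out of bounds. */
bool coolarray_set(Coolarray *a, size_t index, int value)
{
    assert(a != NULL && a->data != NULL);
    if (index >= a->count) {
        return false; /* a TFD runtime check would report an error here */
    }
    a->data[index] = value;
    return true;
}
```

In C every access has to go through a helper like this to be checked; the point of the modifier is that plain a.data[i] gets the same check without the wrapper.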
Casting/converting variables
// Conversion (pick only 1).
f32 x = (f32 #convert)y; // Properly converts y into the closest equivalent x.
f32 x = (f32 #place_bits)y; // This just slaps the bits in without converting anything, this probably won't actually become a valid f32. This disables all the other rules below.
// float -> int rounding (pick only 1). NOTE: these only work when converting floats to ints, it does nothing for float-to-float or int-to-int conversions.
i32 x = (i32 #floor)y; // Will floor floats.
i32 x = (i32 #ceil)y; // Will ceil floats.
i32 x = (i32 #round)y; // Will round floats.
i32 x = (i32 #cut_decimals)y; // Will round towards 0 by removing the decimals, basically positive and negative values round differently.
// Range checking (pick only 1).
u8 x = (u8 #check)y; // Makes sure u8 can contain the information from y.
u8 x = (u8 #no_check)y; // This does not do aforementioned check.
u8 x = (u8 #clamp)y; // y will be clamped to the range of u8, so if y is 300, it will become 255.
// Lazy cast, casts to whatever type is appropriate, in this case into an int.
int x = (*)y;
int x = (int)y;
int x = (int #convert #no_check #cut_decimals)y; // Same as above because these are the default casting settings. They can be changed with compiler options.
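To make the range-checking and rounding modifiers a bit more concrete, here's roughly what a few of them would have to look like as hand-written C helpers. All three function names are hypothetical, and this only covers int-to-u8 and float-to-i32, not the general mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Like (u8 #clamp)y: values outside 0..255 are pinned to the nearest limit. */
uint8_t clamp_to_u8(int y)
{
    if (y < 0) return 0;
    if (y > UINT8_MAX) return UINT8_MAX;
    return (uint8_t)y;
}

/* Like (u8 #check)y: returns 1 and stores the result if y fits, else 0. */
int checked_to_u8(int y, uint8_t *out)
{
    assert(out != NULL);
    if (y < 0 || y > UINT8_MAX) return 0;
    *out = (uint8_t)y;
    return 1;
}

/* Like (i32 #cut_decimals)y: a plain C cast already truncates toward zero. */
int32_t cut_decimals(float y)
{
    return (int32_t)y;
}
```

So clamp_to_u8(300) gives 255 as described above, and cut_decimals turns 3.7 into 3 but -3.7 into -3, which is the "positive and negative values round differently" behavior.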
Macros
A macro is a text replacement, it gets replaced with its contents wherever it is used. For the most part, macros work exactly like in C. TFD macros are local to their own scope.
#macro something "Global!"
#function foo () {
#macro something "Local!"
print(something); // "Local!"
}
#function bar () {
print(something); // "Global!"
}
The arguments can optionally have types.
#macro something(foo, f32 x, f32 y) (x*y + foo)
Unlike C, the arguments are captured more similarly to function arguments.
// C
#define foo(x) ...
foo({1, 2, 3}) // Error: foo takes 1 argument, but 3 provided.
// TFD
#macro foo(x) ...
foo({1, 2, 3}) // No problem.
Multi-line macros can be made by "escaping" line breaks with \, or by wrapping the contents inside #{ and #}.
#macro complicated_macro_1 \
#if (x) { \
foo(); \
bar(); \
}
#macro complicated_macro_2 #{
#if (x) {
foo();
bar();
}
#}
Arguments can also be wrapped with #{ #} to more easily input arbitrary text/code into the macro.
#macro funny_loop(condition, increment, inner_code) #{
#for (int i=0; condition; i+=increment) {
inner_code
}
#}
funny_loop(i<100, 2, #{
printf("This is a macro loop!\n");
printf("The number is {}\n", i);
#})
## can be used as a void space to isolate arguments without separating them with a space, it's mostly used to connect arguments to something else and to dynamically create names.
#macro foo(x, y) 999##x##y
printf("I ate {} cakes.\n", foo(100, 50)); // "I ate 99910050 cakes."
#x can be used to place an argument as a string.
#macro foo(x) printf("{} = {}\n", #x, x)
foo(2 * 10); // "2 * 10 = 20"
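Both ## and #x come straight from the C preprocessor, so the two examples above can be tried in C almost as-is. A small sketch (the macro and function names are just picked for this example):

```c
#include <assert.h>
#include <string.h>

#define PASTE_999(x, y) 999##x##y /* token pasting */
#define AS_TEXT(x) #x             /* stringification */

/* Returns nonzero when `text` matches what AS_TEXT produces for 2 * 10. */
int matches_expression_text(const char *text)
{
    assert(text != NULL);
    return strcmp(text, AS_TEXT(2 * 10)) == 0;
}
```

PASTE_999(100, 50) becomes the single number 99910050, and AS_TEXT(2 * 10) becomes the string "2 * 10", the same results as in the TFD examples.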
#on_leave, #on_enter_func, #on_leave_func
#on_leave is a special variation of #macro that automatically places its contents everywhere that the scope ends.
#function main () {
&void data = mem.alloc(1000);
#on_leave free(data);
#if (x) #return; // free(data) is inserted here.
#for (int i=0; i<100; i++) {
&Thing thing = get_thing();
#on_leave release_thing(thing);
#if (x) #continue; // release_thing(thing) is inserted here.
#if (y) #break; // release_thing(thing) is inserted here.
#if (z) #return; // release_thing(thing) and free(data) are inserted here.
// release_thing(thing) is inserted here.
}
// free(data) is inserted here.
}
#on_enter_func and #on_leave_func are also special variations of #macro, they automatically place their contents at the beginning and end of functions.
#on_enter_func printf("Hello! {}\n", #func_name);
#on_leave_func printf("Bye! {}\n", #func_name);
#function testfunc (int x, int y) {
x += y * 2;
#if (x > 1000) {
#return;
}
printf("x={} y={}\n", x, y);
}
The code above would equate to the following:
#function testfunc (int x, int y) {
printf("Hello! {}\n", #func_name);
x += y * 2;
#if (x > 1000) {
printf("Bye! {}\n", #func_name);
#return;
}
printf("x={} y={}\n", x, y);
printf("Bye! {}\n", #func_name);
}
The above example is slightly misleading because if you do #return foo(), the #on_leave_func contents must be placed after the function call, so you can't just think of it as going before the whole return statement.
This code will not be added to functions marked as #inline.
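For comparison, the closest thing standard C has to #on_leave is the manual "goto cleanup" pattern, where every early exit jumps to a single label that does the freeing. Here's a minimal sketch; process and cleanup_runs are hypothetical names, and the counter exists only so the cleanup can be observed.

```c
#include <assert.h> /* only needed if you check the results with assert() */
#include <stdlib.h>

/* Counts how many times the cleanup code ran, purely for demonstration. */
int cleanup_runs = 0;

/* Returns 0 on success and -1 on early exit. The buffer is freed either way. */
int process(int fail_early)
{
    int result = -1;
    void *data = malloc(1000);
    if (data == NULL) {
        return -1; /* nothing to clean up yet */
    }

    if (fail_early) {
        goto leave; /* #on_leave would insert free(data) at this #return */
    }
    result = 0;

leave:
    free(data);
    cleanup_runs++;
    return result;
}
```

It works, but the cleanup lives far from the allocation, and nested scopes need one label each. With #on_leave the cleanup sits right next to the line that creates the resource.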
Importing and building
To build a program, simply give the compiler a starting code file. All the options relevant to building a program must be defined through special global variables (list of options below).
There's no forward declaration or headers like in C, however files that contain macros or compile-time ifs may require special considerations (see #section). Code files are imported directly with #import which gives access to that file from the current file, but unlike C #include which just copy pastes the file contents, the file is imported as a self-contained object.
#import "coolarray.tfd";
Coolarray test;
init_array(&test);
You can also import a file into its own namespace:
#import "coolarray.tfd" ca;
Coolarray test; // Error: Coolarray is undefined.
init_array(&test); // Error: init_array is undefined.
ca.Coolarray test;
ca.init_array(&test);
#module works the same way except the path is relative to library directories (mostly the compiler's standard library directory) instead of your project's directory. It also implicitly has #no_everywheres enabled, more about that in Name visibility. #module is meant for re-usable/third-party libraries, #import is meant for your project's files.
#module "string.tfd";
String test = "Hello world";
#paste can be used if behavior identical to #include from C is desired, it acts exactly like the file's contents were inserted here, which means you can use this multiple times.
#paste "some_code.tfd";
#file can be used to place the contents of a file into an array or string.
[]u8 file = #file "something.txt";
#for (int i=0; i<#countof(file); i++) {
printf("Char {c}\n", file.[i]);
}
All of the above can be used in any normal scope.
#function foo () {
#import "coolarray.tfd";
Coolarray test;
}
#function bar () {
Coolarray test; // Error: Coolarray is undefined.
}
#section and order of compilation
Since TFD has no header files, it must be compiled non-linearly. The compiler may run into a name that it doesn't know about yet (it's later in the same file, or in another file), so it has to defer that part until later.
However, while functions and types are order-independent, macros and compile-time ifs (#static_if) are order-dependent because they change how the code after them should be interpreted. When the compiler finds a #static_if, the condition must be solvable at that time, it cannot be deferred to later. Similarly, a macro will only substitute code that comes after it.
In order to tune the order of #static_ifs and #macros, the file can be split into sections with #section. The sections must first be defined with #define_sections.
#define_sections MACROS, BODY;
#import "stuff.tfd";
#import "things.tfd";
#section MACROS;
#macro lollercoaster 69
#section BODY;
#function main () {
printf("{}\n", lollercoaster);
}
When the compiler encounters #import, it will move to that file. When it encounters #section, it will pause parsing the file as if the file ended. When every file has paused or ended, it will continue parsing every file that is paused at the next section, and continues this process until all sections in all files are parsed.
This allows 2 files to import each other and use macros from each other. The sections where the macros are defined get parsed from both files before the body from either file. You can define whatever sections you want, but you can only define them once and cannot change them later.
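The round-based order described above can be modelled with a small C simulation. This is only a sketch under stated assumptions: SourceFile, parse_order and the "file:section" log format are hypothetical names invented for the example, every file is assumed to use the same section list, and #import handling is left out (the files are simply handed to the scheduler in import order).

```c
#include <stdio.h>
#include <string.h>

#define MAX_SECTIONS 4

/* Hypothetical model: a file is a list of chunks, one per section. */
typedef struct {
    const char *name;                 /* file name */
    const char *chunks[MAX_SECTIONS]; /* label of each section's chunk, NULL once the file has ended */
} SourceFile;

/* Writes the order in which chunks get parsed into `out`.
   Every file is parsed up to its next #section before any file continues past it. */
void parse_order(const SourceFile *files, int file_count, int section_count,
                 char *out, size_t out_size) {
    out[0] = '\0';
    for (int s = 0; s < section_count; s++) {       /* one round per section */
        for (int f = 0; f < file_count; f++) {      /* resume every paused file */
            if (files[f].chunks[s] == NULL) continue;
            size_t used = strlen(out);
            snprintf(out + used, out_size - used, "%s:%s ",
                     files[f].name, files[f].chunks[s]);
        }
    }
}
```

With two files that both define MACROS and BODY, the MACROS chunks of both files come out before either BODY chunk, which is what lets the files use each other's macros.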
If you need more fine-grained control over which file is parsed first, you can use #defer_compiler to defer the parsing of the current section of the file as much as possible, or #compiler_goto "foo.tfd" to manually invoke a switch to another file. These would mostly be used in weird cases where macros that affect #static_ifs are in the same section as those #static_ifs.
TODO: Is it sufficient to just define all the macros at the top and then put #defer_compiler after them? Maybe #section is not even needed.
TODO: If file 1 imports file 2, and file 2 does defer, file 1 may not have the macros that it expected to have from file 2.
TODO: Is it useful or practical for the compiler to remember what #static_if resolved into, and give a warning if the condition changed after the #static_if was already parsed?
Name visibility
The visibility of every global name in a file can be controlled with 3 keywords which affect whatever is defined after them.
#public names are visible when someone imports this file. This is the default.
#private names are not visible (unless this file is imported with #force).
#everywhere names are visible to files that are imported from the current file, so it effectively injects the name into other files.
#private #struct Thing {
int x;
int y;
};
#private #function test () {
print("Testing a thing {}\n", foo);
}
#private int foo = 123;
#everywhere #module "string.tfd";
#everywhere #macro program_name "My awesome project"
// cool_array.tfd will implicitly import string.tfd, and the program_name macro will be usable there. The same is true for all files that cool_array.tfd imports.
#import "cool_array.tfd";
// boring_array.tfd will neither import string.tfd, nor have access to program_name.
#import "boring_array.tfd" #no_everywheres;
// #module implicitly has #no_everywheres.
#module "third_party_library.tfd";
// ...but you can change that.
#module "hackable_library.tfd" #include_everywheres;
Note: #everywheres will get "baked in" to a file when the file is first parsed. #importing the same file again does nothing, so you can't import it with different #everywheres later. #everywhere names are intended for project-wide settings and libraries that you want available everywhere; it's better to compare them to the -D compiler option in C than to #define.
If some library is over-using #private or you want to get more access than was intended, you can forcibly get access to all the names by using #force. This will treat all #private names as #public.
#import "coolarray.tfd" #force;
Functions can be linked from a pre-compiled library.
// This function will be visible if you compile an object file or DLL.
#function foo () #export { ... }
// The other side of #export: this function comes from a compiled library at the linking stage.
#function bar () #external;
If you're creating a pre-compiled library, a header file will be needed to use it. For this purpose you can import a file with #validate_header. This will compare all #external functions in the file with #export functions from the current file, and give an error if there's a mismatch. It also compares types and requires types of the same name to be identical.
#import "coolarray.tfd" #validate_header;
Build rules
These are similar to compile-time #everywhere variables that the compiler uses directly to control its behavior. They can be changed at any time, and like macros, they're bound to scopes, so changing them at the start of a function will revert them at the end of the function, and putting one at the end of a file doesn't do anything.
The following options only have effect when used in the starting file, but the values can be read from other files.
#exe_name = "coolprogram.exe";
#exe_path = "release/bin";
#exe_icon = "res/icon.png";
#add_linked_library("Gdi32"); // Equivalent to -lGdi32 in GCC.
#add_linked_library_path("/foobar/lib"); // Equivalent to -L"/foobar/lib" in GCC.
#add_module_path("/foobar/include"); // Equivalent to -I"/foobar/include" in GCC.
If you want custom values, use macros with #everywhere:
#everywhere #macro program_version 123
#everywhere #macro program_name "Cool Program"
These affect how the program is compiled and what's in it. These are scoped, so you can change them inside an individual function or any scope.
#optimization_level = .MAX_SPEED;
#runtime_bounds_checking = #true;
#remove_unused_functions = #true; // Will delete any functions from the program that aren't called from anywhere else and that don't have #export.
#remove_unreached_functions = #false; // Similar to above, but a search is done starting from the main function to check if functions are reached from it.
These are default settings and rules that can be modified according to your preference. It's not recommended to change these, but you can. One of the core principles of TFD is that it's not the language designer's job to tell the programmer what's the right way to program. It can only make suggestions through the default values.
Unlike the settings above, libraries imported with #module will have their own self-contained rules for all the settings below, unless you import it with #include_everywheres.
#default_type_visibility = #public;
#default_macro_visibility = #public;
#default_function_visibility = #public;
#default_global_variable_visibility = #private;
#default_import_visibility = #private;
#default_typedef_strictness = #abitstrict;
#default_enum_strictness = #strict;
#allow_int_signedness_loss = #false; // i32 -> u32
#allow_int_signedness_lossless = #true; // u8 -> i16
#allow_int_size_loss = #false; // u64 -> u32, u32 -> i32
#allow_int_size_lossless = #true; // u16 -> u32
#allow_float_size_loss = #false; // f64 -> f32
#allow_float_size_lossless = #true; // f32 -> f64
#allow_float_to_int_loss = #false; // f32 -> i32
#allow_int_to_float_loss = #false; // i32 -> f32
#allow_int_to_float_lossless = #true; // i8 -> f32
#auto_cast_from_void_pointers = #true; // &void -> &int
#auto_cast_to_void_pointers = #true; // &int -> &void
#auto_cast_from_array_pointer = #true; // &[]int -> &int
#auto_cast_to_array_pointer = #true; // &int -> &[]int
#bool_only_accepts_true_false = #true; // If true, bool will not accept integers, improves type checking when calling functions.
#treat_true_false_as_int = #false; // If true, 'true' and 'false' can be used more freely, for example you can set them into an 'int' variable without casting.
#default_casting_behavior = #convert #no_check #cut_decimals;
#allow_lazy_cast = #true; // x = (*)y
#completion_of_enums_on_cases = #partial_enum; // #partial_enum, #complete_enum
#untyped_enum_size = .SMALLEST; // What size should enums be if they don't have a type.
#aligned_members_must_match_struct_align = #true; // If you have a struct member with align(16) in a struct whose size is not divisible by 16, you get an error.
#default_struct_packing = .POWER_LESS_OR_EQUAL_8_OR_8; // If the variable is 2 bytes, it's aligned to 2 bytes, if 4 then aligned to 4, if 3 then to 4, if bigger than 8 then to 8.
// .POWER always aligns to the nearest power of 2, for example 17-byte struct aligns to 32.
// .LESS_OR_EQUAL_X aligns to any size, not just powers of 2.
#non_inferrable_string_macro = String; // In some situations the string type can't be inferred from context, in that case this string macro is used.
#max_enum_member_names_count = 16_384; // If you use #enum_member_names and the array would be longer than this, you get an error.
#max_enum_member_names_size = 1_000_000; // Same as above except byte size.
#zero_initialize_arrays = #true;
#zero_initialize_structs = #true;
#zero_initialize_primitive_types = #true;
#maintain_zeroed_struct_padding = #true; // If false, struct padding may be left uninitialized, and arbitrary data may be written to it, depending on what the compiler thinks is faster to do.
#assume_zeroed_struct_padding = #true; // If true, some optimizations may be made, for example equality of structs can be checked across members even if there's padding in-between them. Programmer must ensure that uninitialized data isn't used for structs.
#allow_stack_alloc = #true; // void* foo = #stack_alloc(x);
#allow_variable_length_arrays = #true; // [x]int foo; where x is a variable, not a compile-time constant.
#allow_assignment_in_conditions = #false; // if (x = foo()) ...
#allow_assignment_in_statement = #false; // x = foo[y=foo()];
#allow_increment_decrement_in_conditions = #false; // if (x++) ...
#allow_increment_decrement_in_statement = #false; // x = foo[y++];
#allow_redeclaration_of_name_from_parent_scope = .YES; // .YES or .NO or .SAME_TYPE_ONLY
#allow_redeclaration_of_name_from_same_scope = .NO; // .YES or .NO or .SAME_TYPE_ONLY
#allow_unused_variables = #true;
#allow_unreachable_code_after_return = #true;
#allow_unicode_in_comments = #true; // /*🔥*/
#allow_unicode_in_strings = #true; // "🔥"
#allow_unicode_in_character_literals = #true; // '🔥'
#allow_unicode_in_code = #true; // int 🔥 = 123;
#enforce_indentation_character = .NONE; // .TABS or .SPACES
#enforce_indentation_length = 0;
#enforce_open_brace_placement = .NONE; // .SAME_LINE or .NEW_LINE
#enforce_else_placement = .NONE; // .SAME_LINE or .NEW_LINE
#enforce_close_brace_placement = .NONE; // .MATCH_OPEN_BRACE_LINE
#enforce_unbraced_code_placement = .NONE; // .SAME_LINE or .NEXT_LINE_PLUS_INDENT or .FORBID, this refers to foo in if (x) foo;
// UPPER_CASE = 0x1 HELLOWORLD
// LOWER_CASE = 0x2 helloworld
// BEGIN_UPPER = 0x4 Helloworld
// BEGIN_LOWER = 0x8 helloworld
// FORBID_UNDERSCORE = 0x10
// example: .UPPER_CASE|.BEGIN_LOWER = hELLOWORLD
#enforce_struct_capitalization = .NONE;
#enforce_enum_capitalization = .NONE;
#enforce_macro_capitalization = .NONE;
#enforce_function_capitalization = .NONE;
#enforce_variable_capitalization = .NONE;
#enforce_typedef_capitalization = .NONE;
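The capitalization rules read like bit flags that can be OR-ed together. Here is a minimal C sketch of how such a check might work, assuming each flag occupies its own bit (so the underscore flag is taken to be 0x10 here). The enum names mirror the comments in the settings, but name_matches_rules and the exact semantics (a BEGIN_* flag overrides the general rule for the first character only) are guesses made for the example.

```c
#include <ctype.h>
#include <stdbool.h>

enum {
    CAP_NONE              = 0x0,
    CAP_UPPER_CASE        = 0x1,  /* HELLOWORLD */
    CAP_LOWER_CASE        = 0x2,  /* helloworld */
    CAP_BEGIN_UPPER       = 0x4,  /* Helloworld */
    CAP_BEGIN_LOWER       = 0x8,  /* helloworld */
    CAP_FORBID_UNDERSCORE = 0x10  /* assumed value: the next free bit */
};

/* Checks one character against the requested case. Non-letters always pass. */
static bool letter_ok(char c, bool want_upper, bool want_lower) {
    unsigned char u = (unsigned char)c;
    if (!isalpha(u)) return true;
    if (want_upper && !isupper(u)) return false;
    if (want_lower && !islower(u)) return false;
    return true;
}

/* Hypothetical helper: returns true if `name` satisfies every rule set in `flags`. */
bool name_matches_rules(const char *name, unsigned flags) {
    for (int i = 0; name[i] != '\0'; i++) {
        if (name[i] == '_' && (flags & CAP_FORBID_UNDERSCORE) != 0) return false;
        bool upper = (flags & CAP_UPPER_CASE) != 0;
        bool lower = (flags & CAP_LOWER_CASE) != 0;
        /* A BEGIN_* flag overrides the general rule for the first character. */
        if (i == 0 && (flags & (CAP_BEGIN_UPPER | CAP_BEGIN_LOWER)) != 0) {
            upper = (flags & CAP_BEGIN_UPPER) != 0;
            lower = (flags & CAP_BEGIN_LOWER) != 0;
        }
        if (!letter_ok(name[i], upper, lower)) return false;
    }
    return true;
}
```

Each flag needs a bit of its own for the | combination to stay unambiguous, which is why the values are powers of two.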
Miscellaneous modifiers and mechanics
If some C attribute equivalent isn't here, then I probably don't normally use it. That doesn't necessarily mean it shouldn't be in the language, I just never think about it. I'd need someone more knowledgeable about assembly and compiler optimizations and stuff to tell me what kind of modifiers and features are useful.
#inline, #noinline - #inline functions behave identically to macros in that the function call is replaced with the function's contents. Unlike in C, #inline functions will always be inlined; if one cannot be (for example if it recursively calls itself), you get a compiler error. In C, the inline keyword is just a suggestion and isn't guaranteed to do anything. The compiler may inline functions that aren't marked as #inline if it thinks it's a good idea; #noinline prevents that. You can also use these at the call site.
#function foo () #inline { ... }
#function bar () { ... }
#function main () {
foo();
#inline bar();
}
I don't know much about linkers, so I don't know how inlining of external library functions works. I don't really care to be honest, I feel that the benefit of inlining linked functions is smaller than the benefit of inlining function code because the latter has much greater potential for optimization.
#must_receive - If the caller of a #must_receive function doesn't receive the return value, you get a compiler error. Useful when the function returns some allocated memory/object that's expected to be freed from the outside.
#align(x) - Can be used for type definitions or variable definitions, aligns it in memory. Same syntax as #offset() except this can be a modifier to a type too. Can also be used for values behind pointers, for example & f32 #align(32) would mean that the pointer address is aligned to 32 bytes, allowing the usage of 256-bit SIMD operations on it. May cause a crash if the address isn't aligned. & #align(32) f32 would mean that the pointer variable itself is aligned, but its value may not be. Unlike most modifiers whose location is not important, this modifier must come immediately after the thing being aligned.
#persist - Variable in a function whose value persists across function calls, i.e. a global variable that's only accessible from the function. In C you would use static.
#thread_local - Variables marked with this are unique per thread.
#read_only - For variables, the data cannot be written to. Basically the same as const in C.
#warning "x", #error "x" - Can be attached to functions, macros, types, global variables, enum values, or globally to a file. Causes the compiler to give a warning or an error if they are used, useful for things that are deprecated or unfinished or broken.
#stack_alloc(x) - Similar to alloca() in C.
#typeof(x) - Expands to the type of x.
In C, ++ and -- work differently from +=1 and -=1. In TFD they're different syntax for the same thing.
&int foo;
*foo += 1; // Increments the int.
*foo ++; // Increments the int. In C this would shift the pointer and dereference for no reason.
A semicolon after most braces (functions, ifs, loops...) is optional and will not do anything.
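For comparison, this is the C behavior that the ++ note above refers to, as a small sketch. In C, postfix ++ binds tighter than unary *, so *p++ advances the pointer, while (*p)++ and *p += 1 change the value behind it. The function names are made up for the example.

```c
#include <stddef.h>

/* `*p++` reads the current int and then moves the pointer; the int itself is unchanged.
   Returns the value that was read and stores how far the pointer moved in `shift`. */
int star_plusplus(int *values, ptrdiff_t *shift) {
    int *p = values;
    int read = *p++;        /* parsed as *(p++) */
    *shift = p - values;
    return read;
}

/* Increments the int behind the pointer, which is what TFD's `*foo ++` is meant to do. */
void increment_pointee(int *p) {
    (*p)++;                 /* same effect as *p += 1 */
}
```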
Additional thoughts and ideas
Here goes stuff that I'm unsure about, or otherwise think is worth mentioning.
-
There's value in keeping the language simple. The easier it is to implement the compiler, the less likely it is for the language's survival to be dependent on a single compiler author/source. It makes me feel more secure.
I don't think it's bad to have lots of modifiers for everything (especially if the program can be compiled and works even if the modifier doesn't do anything, such as #inline for functions), but it is bad to add features that are complicated to implement.
It's also useful if it's easy to parse/analyze parts of the source code, even if you don't want to implement a compiler for the language, so ideally the syntax should be easy to interpret.
-
Struct member expectations (tagged unions). It could be possible to give struct members a condition, and any scope/function where that member is used will check for the condition and give an error if it's false.
#enum TYPE {
FOO;
BAR;
ZIP;
};
#struct Thing {
TYPE type;
int x #expects(.type==.FOO);
int y #expects(.type==.FOO);
f32 a #expects(.type==.BAR);
f32 b #expects(.type==.BAR||.type==.ZIP);
};
#function do_the_thing (Thing thing) {
#if (thing.type == .BAR) {
thing.a = 1.5;
thing.b = 0.1;
thing.x = 123; // Error: .x expects (.type==.FOO), but checked that (.type==.BAR) on line 15.
}
}
I'm not sure how useful this would actually be in practice, or whether there's a better way to do this. There has been at least one time where I had a bug caused by using the wrong thing from a union.
This would work similarly to bounds checking in that it happens at compile-time whenever possible, and you can turn off runtime checks.
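What #expects would automate can be approximated by hand in C with a tagged union and checked accessors. This is a minimal sketch loosely following the Thing example. The accessor names thing_set_x and thing_set_b, the union layout, and the bool return value (instead of a compile error) are all choices made for this example, since plain C can only do this check at runtime.

```c
#include <stdbool.h>

typedef enum { TYPE_FOO, TYPE_BAR, TYPE_ZIP } Type;

typedef struct {
    Type type;
    union {
        struct { int x, y; } foo;    /* valid when type == TYPE_FOO */
        struct { float a, b; } bar;  /* a: TYPE_BAR only, b: TYPE_BAR or TYPE_ZIP */
    } as;
} Thing;

/* Runtime stand-in for `int x #expects(.type==.FOO)`. */
bool thing_set_x(Thing *thing, int value) {
    if (thing->type != TYPE_FOO) return false;  /* wrong tag: refuse the write */
    thing->as.foo.x = value;
    return true;
}

/* Runtime stand-in for `f32 b #expects(.type==.BAR||.type==.ZIP)`. */
bool thing_set_b(Thing *thing, float value) {
    if (thing->type != TYPE_BAR && thing->type != TYPE_ZIP) return false;
    thing->as.bar.b = value;
    return true;
}
```

The difference to the idea above is that every access has to go through an accessor, and nothing stops code from touching thing.as.foo.x directly.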
-
Variable groups. I'm about 85% sure that this should be in the language, but I haven't had enough time to feel it yet, and there might be a reason why it's not as simple as it looks.
#function test (int x) int, int {
#if (x > 1000) {
#return 0, 0;
}
#else {
#return x*10, x*20;
}
}
int a, int b = test(50);
int c = 15;
int d = 200;
c, d = d, c;
d, c = test(a);
This is basically a less verbose version of unnamed structs. Since unnamed structs can work, I imagine there isn't a reason why variable groups couldn't, unless there's some case of ambiguous syntax somewhere.
#function test (int x) #struct { int; int; } {
#if (x > 1000) {
#return {0, 0};
}
#else {
#return {x*10, x*20};
}
}
#struct { int a; int b; } = test(50);
printf("{}, {}\n", a, b);
-
#undefine for removing macros and allowing you to re-define them (same as #undef in C)?
-
Are the names of #import and #module backwards? Maybe they make more sense the other way round.
-
#endif. Sometimes it's annoying how switch increases indentation by 2 levels, and trying to lower it just feels weird because of the braces.
#if (foo) ...;
#case 1;
print("1\n");
#case 2;
print("2\n");
#case 3; {
int foo = 123;
something(foo);
print("3!\n");
print("This is a pretty cool number.\n");
}
#case;
print("Idk what's happening.\n");
#endif;
You could maybe also use it for normal ifs, but I'm not sure what the utility would be.
#if (foo == bar);
int foo = 123;
something(foo);
print("Idk what's happening.\n");
#endif;
I'm not 100% sure if this feature is really worth having though, there's something about it that feels off.
-
#typeid()
int type_of_int = #typeid(int);
int type_of_float = #typeid(float);
int type_of_Foo = #typeid(Foo);
Returns a different integer value for each type in the program.
I'm unsure about this feature because I don't know how/if it would work if you link with pre-compiled libraries. I'm not familiar with how linking works, but it shouldn't be hard to add a table of type ID locations into the pre-compiled library and just swap them out during linking. Maybe the hardest part is matching type conflicts, like how do you know that Foo in library 1 is the same type as Foo in library 2?
Even worse, this probably cannot work properly with dynamically linked libraries, you'd have to use some weird runtime translation or type info tables or something, and that sounds way more complicated than what I want.
-
I want better variable function arguments, but I also don't want to make functions like printf more expensive by making it more internally complicated and bloated. If the function can operate with minimal information then the compiler shouldn't set up a bunch of crap that the function doesn't need. I'd have to know more about how variable arguments are implemented in C to be able to design a better system.
An example of what I'd want is basically sending a struct that has a count, pointer to a list of type IDs, and pointer to a list of pointers to the values.
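In C terms, the struct described above could look roughly like the following sketch. All names here (ArgsArray, TypeId, sum_ints) are hypothetical, and the type IDs are a hand-written enum standing in for whatever #typeid would produce. The caller builds the arrays by hand, which is the part the compiler would take over.

```c
#include <stddef.h>

typedef enum { TYPEID_INT, TYPEID_F32 } TypeId;  /* stand-in for #typeid values */

typedef struct {
    size_t count;           /* number of arguments */
    const TypeId *typeids;  /* typeids[i] describes values[i] */
    void *const *values;    /* pointers to the argument values */
} ArgsArray;

/* Example consumer: adds up the int arguments and skips everything else. */
long sum_ints(ArgsArray args) {
    long total = 0;
    for (size_t i = 0; i < args.count; i++) {
        if (args.typeids[i] == TYPEID_INT) {
            total += *(const int *)args.values[i];
        }
    }
    return total;
}
```

A function that doesn't care about the types can ignore typeids entirely, so the cost stays at three words passed per call plus the two small arrays.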
#function print (#args_array args) {
#for (int i=0; i<args.count; i++) {
#if (args.typeids[i]) ... {
#case #typeid(int);
&int value = args.values[i];
#case #typeid(Vec2f);
&Vec2f value = args.values[i];
#case;
&void value = args.values[i];
}
}
}
-
I don't have a full picture of how this works at assembly level, but it may be necessary to control whether it's valid to access variables from the parent scope from a callback.
#function main () {
int count = 0;
#function callback () {
count ++;
}
do_stuff(callback);
}
This is a bit ambiguous and weird and possibly error prone (especially if it dereferences pointers) because you have no idea where this callback is going, and it may be stored and called later, and you can't just use fixed stack memory offsets from a callback like this.
If the nested function is only called directly from the parent function, then it's a lot more straightforward, it's the callbacks that are problematic.
Perhaps function pointers could have a #synchronous modifier which communicates that the function is not stored long-term nor called asynchronously. Without that property, you can't use variables from parent scope. Or #asynchronous to invert the default.
#function do_stuff (&#function callback ()) {
callback();
}
#function do_stuff_sync (&#function callback () #synchronous) {
callback();
}
#function main () {
int count = 0;
#function callback () {
count ++;
}
do_stuff(callback); // Error: callback may be asynchronous, variables from parent scope cannot be used.
do_stuff_sync(callback);
}
I'm trying to design something without full understanding of how it works though, so maybe this is all nonsense.
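For reference, standard C has no nested functions, so the usual way to let a callback touch the caller's variables is to pass an explicit context pointer. The sketch below shows that pattern. Callback, do_stuff_sync, count_calls and run_example are illustrative names. It only stays valid as long as the callee does not keep the pointer after returning, which is the property the #synchronous idea tries to express.

```c
typedef void (*Callback)(void *context);

/* Calls the callback right away and does not store it: the "synchronous" case. */
void do_stuff_sync(Callback callback, void *context) {
    callback(context);
    callback(context);
}

/* The callback reaches the caller's local variable through the context pointer. */
static void count_calls(void *context) {
    int *count = context;
    (*count)++;
}

int run_example(void) {
    int count = 0;                       /* lives on this function's stack */
    do_stuff_sync(count_calls, &count);  /* safe: the pointer is not used after the call returns */
    return count;
}
```

If do_stuff_sync saved the pointer and called it after run_example returned, the callback would write into a dead stack frame, which is the asynchronous case that would need to be rejected.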
-
C's text-replacement macros are known to be problematic for various reasons. TFD improves them slightly, but some of the fundamental problems like the difficulty of debugging them aren't fixed. The good thing about them is that they're very simple, which keeps the compiler simple, and doesn't require the user to know much about them. They're also extremely powerful because they can generate almost any syntax.
Ideally I would prefer a more structured macro/metaprogramming system, but I don't really know what that would look like. C-style macros combine extremes of capability and simplicity, it's very very hard to compete with them. Inline functions and compile-time functions can solve some of the things macros are needed for, but it's not quite enough.
An idea I'm most interested in is compile-time scripting. The easiest example is a function that runs and returns its value at compile-time:
#function wowza (int x, int y) int #compile_time {
#for (int i=0; i<10; i++) {
x *= y;
}
#return x;
}
Ideally this should be able to do almost anything that a normal function can. Types and maybe even variables can use #compile_time.
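To make the wowza example concrete, here is the same loop as a plain C function. Standard C has no equivalent of #compile_time for functions, so this version simply runs at runtime (an optimizer may still fold the call when the arguments are constants, but nothing guarantees it).

```c
/* Runtime version of the wowza example: multiplies x by y ten times, i.e. x * y^10. */
int wowza(int x, int y) {
    for (int i = 0; i < 10; i++) {
        x *= y;
    }
    return x;
}
```

For example, wowza(3, 2) is 3 * 2^10 = 3072. With #compile_time the call would be replaced by that constant before the program ever runs.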
-
Data segments. I'm not too familiar with them, but there should be a mechanism for deciding where some data is stored, and what its properties are (real properties like whether it's read-only and what its size is, and fake properties like how data in it should be aligned, or whether it's hot or cold which could be used by the compiler for optimizations). My knowledge starts and ends at the introduction of the Wikipedia page which seems incomplete to me. String macros should probably also be able to control where their data is stored by default.
Ideally you should be able to create arbitrary data sections that you can read/write from as you please. You could create a big u8 array for that purpose, but that sounds like a hack compared to having a blank data section that you can get a pointer to.
Perhaps you could give the compiler some kind of layout of your desired data sections and their organization.
There's probably a few things I would know how to design better, or things I would change my mind on, if I was more knowledgeable about assembly and how programs are structured.
-
There's a trick I sometimes do with C arrays, it still works but I'm bothered by the fact that TFD doesn't improve it: sending variable length array literals to functions (with 0 at the end to denote the end of the array). In fact this requires 1 more character than in C, so in a way this is a downgrade.
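For reference, the C form of this trick looks roughly like the sketch below, using a C99 compound literal. The function names are made up for the example.

```c
/* Sums a zero-terminated list of ints; the 0 marks the end and is not counted. */
int sum_until_zero(const int *a) {
    int total = 0;
    while (*a) {
        total += *a;
        a++;
    }
    return total;
}

int sum_example(void) {
    /* The compound literal decays to a pointer to its first element, so no & is needed in C. */
    return sum_until_zero((int[]){1, 2, 3, 0});
}
```

The array decay is what saves the one character in C.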
#function foo (&int a) {
#for (*a) {
printf("Number {}\n", *a);
a ++;
}
}
foo(&([]int){1, 2, 3, 0}); // Works but ugly and annoying to write and read.
foo(&{1, 2, 3, 0}); // This would be much better, but the function doesn't receive an array so this can't be interpreted as one (if the int was some struct then this would look like a pointer to the struct).
-
I want 16-bit floats, but I'm unsure what their status is as far as CPU support goes. It would be a bit weird to have that type if it's not well supported and consistent. But if it's left out and the user must implement them, then the language should also have operator overloading. I haven't thought about operator overloading much, maybe that should be in the language too.
Operator overloading seems like one of those features that's fine when you do it, but really annoying when some library uses it.
-
I may want to get rid of the ternary operator syntax and replace it with just an if/else.
int foo = #if (bar) x #else y;
int foo = #if (bar) x #else #if (zip) y #else z;
It's somewhat annoying to look at this when you're used to the old syntax though:
int foo = (bar) ? x : y;
int foo = (bar) ? x : (zip) ? y : z;
There's something attractive about the idea of using an if/else. The old syntax is kind of weird and only exists for this one purpose, and you could easily replace it with the same syntax that you use everywhere else. The old syntax also uses 1 or 2 symbols that aren't used anywhere else, so you could re-purpose them for something else (I want to use :foo: for labels, but ? isn't used by anything).
-
Use case for unused symbols? $ @.
-
There would be value in some kind of tagging system that could be used to toggle build rules and control behavior in certain ways, or do some kind of introspection/metaprogramming. For example something like this:
#zero_initialize_arrays = #false;
#zero_initialize_structs = #false;
#zero_initialize_primitive_types = #false;
#optimization_level = .MAX_SPEED;
#set_property = #inline;
#function crunch_some_numbers () {
...
}
#on_enter_func { profiler_start(rdtsc(), #func_id); }
#on_leave_func { profiler_end(rdtsc(), #func_id); }
#function test_one () {
...
}
#function test_two () {
...
}
It could also be used for types, maybe even local scopes. Or just as a generic way of toggling settings and modifiers for any other thing in a more convenient and coherent way. You could use #static_if to change what the tag does.
I haven't thought about this idea very much so I'm not 100% sure about the details of how it would work.
-
There's some merit to a type called char because it communicates that something is meant to be text better than u8. However, from my experience it doesn't really work that way in practice. I often want to just read/write text from/to binary chunks, and UTF-8 is a variable-sized format, so I usually end up thinking of text as a blob of bytes, not as an array of characters. The String type communicates that something is a string anyway, this is mostly a concern in C which doesn't have any explicit string types.
-
There's a few reasons for prefixing signed integer types with i instead of s. Firstly, you rarely think of the words "signed integer", I think it's more common to think of just "integer" or "int" which starts with "i". Meanwhile when you think of an unsigned integer, you specify with the word "unsigned" which starts with "u". Secondly, the letter "s" also reminds me of "string" and "struct", while "i" doesn't really remind me of anything else. When I see "s32", it immediately makes me think of some kind of string. I might want to make a string macro that uses a 16-bit integer for its length, that's what "s16" sounds like to me.
-
I originally wanted structs and enums to not need a semicolon, but there's a problem with unnamed enums:
// Anonymous enum, semicolon needed because this is a variable.
#enum {
FOO;
BAR;
} foo;
// Named enum, semicolon not needed, the compiler knows from the name that this is a type.
#enum SOMETHING {
FOO;
BAR;
}
// Nameless enum.
#enum {
FOO;
BAR;
}
x = 2; // Is this supposed to be an anonymous enum like the first example, or an assignment to another variable?
-
It's always annoying when you want to use a name but it's reserved by something else. I seem to run into this problem consistently enough that I decided to prefix ALL keywords with #, a bit like an inverse of PHP (where all variables are prefixed with $, which I find extremely annoying to code with).
This is also part of the reason for renaming switch() into if()..., replacing default with an empty case, and removing the union keyword (the main reason is because I want to think of structs and unions as the same thing). I have a desire to replace #case X with some kind of symbol, such as ... X or -> X, however I like the fact that #case has similar syntax highlighting as #if so I probably won't change it.
I'm concerned that basedefs.tfd creates 2 separate worlds for this language, one where raw syntax is used, and another where basedefs is used (plus a third world where the user defines their own names). I want to also prefix primitive types, but I really really really really hate the idea of #-prefixed types ever being used in code. People might be encouraged to use them if they prefer the raw syntax or otherwise don't like basedefs.tfd.
Another thing I'm interested in is to define primitive types in a completely different way, for example #int(32, signed). The main reason is because this would make it more natural to define precisely the types you want including obscure special types for embedded devices, and would not be beholden to any architecture assumptions. Of course, the mere idea of someone using types like this directly makes me want to delete this entire page and pretend I never designed a language. Maybe this would not be an actual type, and you can only use it for a #typedef.
-
Something I've always wished I had was global error numbers. The value could flow all the way from some inner function to top-level code and retain its meaning. One interesting way to enable that would be to allow enums to be expanded dynamically, you could even use #enum_member_names to print the error name.
#enum ERRNUM {
NONE;
};
#function fooler () ERRNUM {
#if (x) #return #unique_enum(ERRNUM, BAD_THING_HAPPENED_IN_FOOLER);
#if (y) #return #shared_enum(ERRNUM, MEMORY_ALLOCATION_FAILED);
#return .NONE;
}
#function bar () ERRNUM {
#if (x) #return #shared_enum(ERRNUM, MEMORY_ALLOCATION_FAILED);
#return .NONE;
}
#function main () {
ERRNUM e = fooler();
#if (e) {
[]String error_names = #enum_member_names(ERRNUM);
printf("Error occurred: {}\n", error_names[e]);
}
}
This could potentially have uses in other places, for example plugins/customizability/extensibility, like adding a new UI module to a UI library.
I'm not sure if this is a good idea, it certainly sounds like it would complicate the compiler since no value can be assumed to be invalid until all files in the program have been parsed (the value may come from an expanded enum). Maybe the language should just have a hard-coded error type that can be expanded.
-
Not exactly important, but ideally it should be possible to associate every major name and type with a single letter.
Struct collides with String, could be fixed by renaming Struct into Plex. Function collides with Float, but Procedure would collide with Plex, but there's no other good names for it. Struct could be Class for cultural reasons, but not only is it bad at describing what it does, it's also a word that definitely will be in user code so it would become impractical to remove the # prefix. Group would be super descriptive, but even worse than Class for user code.
Boolean Define Enumerator Float Function Integer Macro Procedure Plex Pointer String Struct Signed Type(def) Unsigned Union
-
How practical is it to use named function arguments without requiring a struct?
#function something (#struct {
int amount = 1;
f32 foo = 0.5;
bool do_the_things = #true;
bool ignore_dumbness = #false;
} args) { ... }
something({15, 3.14});
something({.foo = 3.14, .ignore_dumbness = #true});
#function something (
int amount = 1,
f32 foo = 0.5,
bool do_the_things = #true,
bool ignore_dumbness = #false
) { ... }
something(15, 3.14);
something(.foo = 3.14, .ignore_dumbness = #true);
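For comparison, C can fake named arguments with defaults by combining a struct, designated initializers and a variadic macro. The sketch below is one known way to do it, not a recommendation. SomethingArgs and something_impl are hypothetical names, and the function just returns its arguments so the defaults can be inspected. It relies on the C99 rule that a later designated initializer overrides an earlier one for the same member (some compilers warn about this), and it only works with named arguments, not positional ones.

```c
#include <stdbool.h>

typedef struct {
    int amount;
    float foo;
    bool do_the_things;
    bool ignore_dumbness;
} SomethingArgs;

/* Returns the arguments it received so the caller can see which defaults applied. */
SomethingArgs something_impl(SomethingArgs args) {
    return args;
}

/* Defaults come first; designated initializers passed by the caller override them. */
#define something(...) something_impl((SomethingArgs){ \
    .amount = 1, .foo = 0.5f, .do_the_things = true, .ignore_dumbness = false, \
    __VA_ARGS__ })
```

A call such as something(.foo = 3.0f, .ignore_dumbness = true) then leaves amount and do_the_things at their defaults, which is close to the second TFD form above but needs the struct and the macro behind the scenes.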
-
TFD stands for "Tool For Doing".
Syntax highlighting test.
Comments: /* There's something to say. */
Strings: "Hello world!"
Primitive literals: 12345, 'X', 0xffff00ff
Named constants: #true, #false, #null, #func_id ...
Control flow: #if, #for, #case, #break, #return, #goto ...
Macros and compile-time: #macro, #import, #static_if ...
Types/names/constructs: #function, #struct, #enum, #typedef
Modifiers: #private, #inline, #pack(1) ...
Adding modifiers to return types is complicated, is there even a need for it? #must_receive could be a function modifier, although that won't allow specifying it for one of multiple return values.
#function something (Vec2f pos, Vec2f size) &int #inline { ... }
#function something &int (Vec2f pos, Vec2f size) #inline { ... }
#function something (Vec2f pos, Vec2f size) &int #const #inline { ... }
#function something &int #const (Vec2f pos, Vec2f size) #inline { ... }
#function something #inline (Vec2f pos, Vec2f size) &int { ... }
#function something #inline &int (Vec2f pos, Vec2f size) { ... }
#function something #inline (Vec2f pos, Vec2f size) &int #const { ... }
#function something #inline &int #const (Vec2f pos, Vec2f size) { ... }
#inline #function something (Vec2f pos, Vec2f size) &int { ... }
#inline #function something &int (Vec2f pos, Vec2f size) { ... }
#inline #function something (Vec2f pos, Vec2f size) &int #const { ... }
#inline #function something &int #const (Vec2f pos, Vec2f size) { ... }
#function something &int #const, &int #const (Vec2f pos, Vec2f size) #inline { ... }
#function something (Vec2f pos, Vec2f size) &int #const, &int #const #inline { ... }