Recently, "to C or not to C" became a topic on HN, which is a nice
excuse to spend a couple of hours on an ABC retrospective. The decision
to work in C was rather natural: the author is a C/Go, not C++/Rust
kind of person, so once the Go runtime became a problem, C was the
most straightforward answer. The dirty secret of both C++ and C is
that these two are like IKEA or LEGO languages. Languages to create
other languages. For example, virtually any serious C++ user has
some sort of alternative standard library (Abseil, Qt, there are
many). You don't use C or C++ as-is, normally. The C standard library
is small by design, so that is inevitable for most use cases.
The C and C++ standard libraries are a very mixed bag, effectively
a chronicle of CS ideas of the last 40-50 years. If the C standard lib
is kind of a manuscript chamber in a faraway monastery, the C++ std lib
is more like the Library of Congress. Nobody knows it all, and most
of the ideas written there are definitely not recommended today.
Abstractionless C resulted from many frustrations with C++ and its endless quirks. I needed generics, STL-like containers, disk and network serialization, and some standard algorithms, with no pointer arithmetic and no malloc/free headaches. Coming from Go, I clearly needed slices. That was the pragmatic problem statement: things to improve productivity while doing systems programming.
On the higher philosophical level, I wanted to avoid the cursed
tower-of-abstractions trap that I felt quite sharply in C++.
There, the same bytes packaged differently become entirely different
incompatible entities (like std::string vs std::vector<char> vs
std::valarray<> etc). I understand quite clearly what happens on
the bit and byte level. Lawyering about pure abstractions always
felt counter-productive to me, and C++ always had lots of that.
Many of those abstractions abstracted away things that do not exist
anymore, like big-endian CPUs and HDDs.
I did not want to play Jenga with imaginary bricks.
So the set of architectural choices was:
gives serialization for free (u32, i64, sha256, etc).
is typedef u8* u8s[2]; and a slice is non-owning.
ring buffer logic or ptr/len/cap constructs is built in.
for STL-level containers: HEAPu64Pop(), HEAPu8csPop(), etc
Vectors, heaps, open-addressed hash maps, LSM sorted sets: these are fundamentally arrays.
void SHA1Sum(sha1* hash, u8csc from) declared in SHA1.h,
implemented in SHA1.c, tested in test/SHA1.c, etc.
mmap for solid containers.
u8csb is a buffer-of-const-byte-slices. sha256bMap() mmaps
a buffer of hashes, which might be treated as a vector, a heap,
or a hash set, e.g. with HASHsha256Put()/HASHsha256Get().
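To make the "slice is two pointers" and "containers are fundamentally arrays" points concrete, here is a minimal sketch. Only the `u8s` typedef is taken from the text; `u64s`, `demo_heap_push()` and `demo_heap_pop()` are hypothetical names invented for this illustration and are not the actual ABC API (the real `HEAPu64Pop()` may look quite different). The sketch assumes a min-heap kept in a caller-owned array, with the slice's end pointer marking the used part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;
typedef uint64_t u64;

/* From the text: a slice is a pair of pointers, [begin, end). */
typedef u8 *u8s[2];
/* Assumed by analogy for this sketch: the same shape for u64 elements. */
typedef u64 *u64s[2];

/* Number of elements currently in the slice. */
static size_t demo_u64s_len(u64s s) {
    assert(s[0] <= s[1]);
    return (size_t)(s[1] - s[0]);
}

/* Push v onto a min-heap stored in [heap[0], heap[1]).
   `limit` is one past the end of the backing storage.
   Returns 0 on success, -1 if there is no room. */
static int demo_heap_push(u64s heap, const u64 *limit, u64 v) {
    if (heap[1] >= limit) return -1;
    u64 *a = heap[0];
    size_t i = demo_u64s_len(heap);
    heap[1]++; /* the slice grows by moving its end pointer */
    a[i] = v;
    while (i > 0) { /* sift the new element up */
        size_t p = (i - 1) / 2;
        if (a[p] <= a[i]) break;
        u64 t = a[p]; a[p] = a[i]; a[i] = t;
        i = p;
    }
    return 0;
}

/* Pop the smallest element into *out.
   Returns 0 on success, -1 if the heap is empty. */
static int demo_heap_pop(u64s heap, u64 *out) {
    size_t n = demo_u64s_len(heap);
    if (n == 0) return -1;
    u64 *a = heap[0];
    *out = a[0];
    heap[1]--; /* the slice shrinks, the storage stays where it is */
    n--;
    a[0] = a[n];
    size_t i = 0;
    for (;;) { /* sift the moved element down */
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < n && a[l] < a[m]) m = l;
        if (r < n && a[r] < a[m]) m = r;
        if (m == i) break;
        u64 t = a[m]; a[m] = a[i]; a[i] = t;
        i = m;
    }
    return 0;
}
```

A caller would declare `u64 store[4]; u64s h = {store, store};` and pass `h` together with `store + 4`. Because an array of two pointers decays to a pointer when passed, the callee can move the slice's end without any extra indirection, and the heap never owns or allocates memory.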
Slices and generics are a bit unexpected in C; the rest is just another C style with a funky notation, no biggie. The obvious issue here is that C does not support slices in any of its standard APIs. But the C standard library is not that huge, and its usable part is even smaller, so unless a function is a syscall or somehow gets special treatment from the compiler, what is the value of it? Vanishingly small. Especially in the LLM era. What has a lot of value is the toolchain that understands C and the OS kernel. Those are true megaprojects.
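Bridging slices to the few libc calls that are worth keeping usually takes a thin adapter. The sketch below wraps `memcpy` in a slice-to-slice copy. The exact shape of `u8cs` (assumed here to be `const u8 *[2]`) and the name `demo_u8s_feed()` are assumptions made for illustration, not the real ABC definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t u8;
typedef u8 *u8s[2];        /* mutable byte slice, as in the text */
typedef const u8 *u8cs[2]; /* assumed: slice of const bytes */

/* Copy as many bytes of `from` as fit into `into` using plain memcpy,
   then advance both slices past the copied bytes.
   Returns the number of bytes copied. */
static size_t demo_u8s_feed(u8s into, u8cs from) {
    assert(into[0] <= into[1] && from[0] <= from[1]);
    size_t room = (size_t)(into[1] - into[0]);
    size_t have = (size_t)(from[1] - from[0]);
    size_t n = have < room ? have : room;
    if (n > 0) memcpy(into[0], from[0], n);
    into[0] += n; /* less free room left */
    from[0] += n; /* less data left to consume */
    return n;
}
```

The pointer/length pair that libc wants is computed in exactly one place, and callers only ever see slices. Whatever does not fit stays in `from`, so a short copy is visible in the data itself, not only in the return value.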
So, I sketched some skeleton of my (un)standard lib and started working with it. The "meat" slowly grew, the thing saw one or two refactors along the way, but it mainly remains a collection of small and focused modules with slice-based APIs and increasingly rare malloc use. The cases for malloc go down for the following reasons:
so you deal with u8cs (two-pointer slice) and the bytes
live in the arena,
matches on-disk layout, forget SPARCs and Alphas already),
malloc or something else.
Of the remaining burning questions, one may mention package and dependency management. Obviously, for C that is RPM, APT, apk, Brew and so on. I am not going to bring along second copies of CURL, libsodium, and all the other usual suspects.
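Going back to the arena point in the list of reasons for declining malloc use: a minimal sketch of how a bump arena can hand out non-owning const slices. The struct `demo_arena` and the functions `demo_arena_init()`, `demo_arena_put()` and `demo_arena_reset()` are hypothetical names for this illustration, and the `u8cs` typedef is assumed to be `const u8 *[2]`; the actual ABC arena may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t u8;
typedef const u8 *u8cs[2]; /* assumed: slice of const bytes */

/* Hypothetical bump arena over caller-provided memory:
   [base, top) is used, [top, end) is free. */
typedef struct {
    u8 *base;
    u8 *top;
    u8 *end;
} demo_arena;

static void demo_arena_init(demo_arena *a, u8 *mem, size_t size) {
    a->base = mem;
    a->top = mem;
    a->end = mem + size;
}

/* Copy n bytes into the arena and point `out` at the copy.
   Returns 0 on success, -1 if the arena has no room. */
static int demo_arena_put(demo_arena *a, const void *data, size_t n, u8cs out) {
    assert(a->base <= a->top && a->top <= a->end);
    if ((size_t)(a->end - a->top) < n) return -1;
    if (n > 0) memcpy(a->top, data, n);
    out[0] = a->top;
    out[1] = a->top + n;
    a->top += n;
    return 0;
}

/* Release everything at once; slices handed out earlier become stale. */
static void demo_arena_reset(demo_arena *a) {
    a->top = a->base;
}
```

No individual free exists in this sketch: the slices own nothing, and the arena is released in one step, which is where most of the malloc/free bookkeeping goes away.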
So for my purposes, it worked out fine in a 100 KLoC project. As L. Torvalds once said: "Standards are paper. Buy some and write your own." Or something like that.