This is a more in-depth explanation of how memory management in Lobster works, and is typically not needed to be understood fully to use the language.
It may be interesting to those wanting to implement a similar scheme in another language.
Memory management is an aspect of a language that has one of the biggest influences on how a language turns out: it affects the type system and the kinds of types you can have, it affects efficiency in both time and space, it affects the cognitive model of the programmer in what data structures can be represented, it affects latency, interoperability, and much more.
Yet, it is often ignored and almost invisible at the same time. Many consider it to be a "solved" problem with probably 95% of programming languages out there using some form of garbage collection: allocate, then worry about reclaiming unreachable objects later.
I think it is far from solved. If I'd have to make a top-10 of "least elegant algorithms in computer science", the #1 would be an easy pick. The idea that just because "who is pointing to this object" is a hard question we are not going to even try to track it, and instead repeatedly scan large parts of the heap to recover that information later, really makes me cringe.
The problem is maybe that for the longest time, the alternatives have not been great. Low level languages would use manual memory management (efficient but error prone), and some have used reference counting. Then there were languages based on linear types or regions, but those never entered the mainstream.
I've personally always thought of reference counting as the "least bad" solution, which is why it was initially used for Lobster. Many people dismiss it out-right because it can't collect cycles, but I've found that, in practical use, getting a "cycle report" at program exit that helps the programmer know where to break those cycles, is largely sufficient.
I mentioned linearity (every object can only have one pointer to it) and regions (every object is assigned a memory pool that outlives them all), and I have implemented languages in the past that implemented both of those strategies. While viable, I felt at the time they required too much programmer assistance to be mainstream.
There is a memory management design space which I'll collectively call "Ownership" which is a generalization of the linearity models: one pointer always owns an object (and is responsible for its deallocation), and others merely "borrow" it.
This now poses new questions: How do we determine who is the owner? When is borrowing allowed? How do we detect a borrower that outlives an owner? What about an owner being modified while there are borrowers? Do we check any of these at compile time, runtime, or a mix?
There have been many ownership models in the past, but only recently they are becoming more mainstream:
std::unique_ptr
is essentially an
ownership model, except one that answers questions like "How do we
detect that a borrower outlives an owner" with "We don't". But even with
such caveats, it largely works well, and is a huge improvement over the
manual memory management of C and even the manual RAII of earlier
C++.I personally think this general direction is the future of memory management. It just needs to be made more friendly than Rust for most programmers to want to come on board. Lobster offers a possible direction.
This has two parts, its by-value model and ownership analysis.
The best memory management is.. not to have to manage it at all! You can make an incredibly fast memory manager, but it will never beat simply having less objects to manage. This is the thinking behind Lobster's by-value structs, which offer similar "zero abstraction cost" objects to ones available in C/C++, Rust, and even C#, but surprisingly not in most other languages. It is the most effective and cheapest way towards more efficient memory use that frankly every language should have.
Essentially, you declare an object with the struct
keyword instead of the usual class
, which causes Lobster to
always allocate this object inline in the parent, where "parent" can be
another struct or class, a vector, or simply the stack (as a local
variable). On an implementation level, it is as if you added all the
fields of the struct as fields/variables to the parent, so writing
s.x
(where s
is a struct) is just as cheap as
writing x
. The struct disappears.
This of course has limitations, in that you're completely dependent on the lifetime of the parent, and that you assign these kinds of struct by copy, rather than by pointer. That means it is not great for large objects, but usually up to 4 fields or so this is still more efficient than heap allocation.
Currently, such structs are immutable, since that simply makes sense for small, copied objects. This is likely to be relaxed in the future to also allow mutation.
There are situations where rather than copying the implementation could use a short-lived reference instead, like is common in C++. This is however not actually faster in many cases, and more importantly, a reference would be subject to the ownership analysis below. By making them copies, we essentially "reduce pressure" on that algorithm, meaning it is more likely to produce optimal ownership assignments and less errors.
Having these by-value structs is especially important in a language like Lobster that has so far focussed on the areas of games and computer graphics, which make intensive use of 2D/3D vectors. Having such types be as cheap as possible is important.
Lobster combines its original (runtime) reference counting with a ownership analysis algorithm, to get "compile time reference counting".
This is a fully automatic algorithm that mostly does not require programmer intervention. In my personal testing, working with several dozens of Lobster programs (many of which medium sized game prototypes) I needed to make minor changes in only 2 locations when transitioning from runtime to compile-time reference counting, the rest "just worked". Or rather, I was able to make the algorithm jump through hoops such that code changes weren't necessary.
Essentially, the algorithm picks a single owner for each new heap allocation, which is usually the first variable, field, or vector element it gets assigned to, and then tries as best as possible to make all uses from there on "borrows". The initial ownership and all borrows do not require any runtime reference counting. If somewhere else in the code something wants to own that same value, it will insert a reference count increase only for this particular use (in Rust, this would cause an error instead). Using this analysis was able to remove around 95% of runtime reference count operations.
This is the default behavior, since it is "good enough" of an optimisation for most code, and very convenient for the programmer. For those cases where the programmer wants maximum control however, they can opt-in to a more Rust-like model on a per variable basis, which will guarantee there will only be one owner ever, and error out if the analysis is not able to guarantee this. This will also be able to remove the reference count field from objects entirely, a further optimisation. This has not been implemented at the time of writing, however.
These are errors that may happen as a consequence of ownership analysis, though all of them have proven to be very rare.
cannot assign to x while borrowed
This can be caused by code like:
var x = "hello"
def f(y):
x = "world"
print y
f(x)
Here, y
borrows x
, then f
tries to overwrite x
while still being borrowed, which
would cause it to deallocate "hello"
while y
is still pointing at it. It is pretty rare for code that uses a borrowed
value to be accessing the original value also. As I mentioned, this only
happened twice in a large volume of test code, and is easy to avoid.
cannot assign to borrowed argument: x
This typically never triggers, as arguments that are assigned to are typically forced to be owners, but there are currently some exceptions related to dynamic dispatch where this may still happen. Likely will be eliminated entirely in the future.
variable x still has 1 borrowers
x used in <feature> without being borrowed
These two I've never even seen happen, they largely exist because I am not smart enough to write a formal proof that they can't happen, and then ensure that my code matches the proof :)
The "ground truth" for how the algorithm is to be found in
typecheck.h
, this section is merely a description of it. If
you were hoping for pages of equations like the typical PL paper, I am
going to have to disappoint.
More on the type checker here.
Wait, what, the type checker? Yes, as much as I would have preferred to make the ownership analysis a standalone algorithm, it is interwoven with type checking. The reason for this is that a lot of the power of Lobster centers around its "Flow Sensitive Type Specialization", meaning it type checks functions in call-graph order, and specializes them based on types. As it turns out, to most optimally remove reference count operations, we want to specialize on ownership as well. Making it a separate algorithm would mean duplicating a lot of this logic, so I decided to interleave them.
The ownership analysis works differently from most other languages in that rather than working just on variables and other storage locations, it works on all values in the language.
To be more precise, every AST node has a ownership "kind" it expects of its children, and an ownership kind it passes to its parent. The core of the algorithm is thus the matching that takes place between child and parent:
let a = [ 1, 2, 3 ]
the vector wants to be
owned, and the variable wants to own, everybody is happy. No Reference
counts touched.let a = b
,
b
wants to borrow (lend) itself, but a
wants
to own, so this has to be reconciled.print [ 1, 2, 3 ]
like most built-in functions
print
defaults to only wanting to borrow the value, and the
vector wants to be owned. So instead it gets held on to and deleted at
the end of the scope.Owned values are simple.. these are values that someone needs to be responsible for at all times.
Borrowed values are a little more complicated. Whenever an expression returns the value of a variable, field or vector element we want to default to borrowing these values, since the L-value already owns it. We have a stack of currently active L-values, and the ownership kind returned by the expression refers to these stack elements. These stack values are also reference counted, as multiple borrows may be active at once. This reference count decreases as borrowed values get "consumed". Mutating a borrowed L-value with a reference count > 0 produces the above error. The reference count not being 0 at the end is a bug in the ownership analysis :)
Note that variables currently always own. This was done for simplicity, as in theory a variable could be made to either own or borrow depending on the needs of its initial RHS, but this was awkward in cases where the definition and subsequent assignments disagreed, and generally made ownership analysis more difficult. This could be revisited in the future once the consequences of the algorithm are better understood, or maybe when more accurate dataflow information is present.
The end of a variable's lifetime is typically the end of the scope,
though making it its "last use" is being worked on, see
return
below.
Function ownership specialization is probably quite unique to Lobster, and solves an important problem: if different callers to a function have different ownership kinds that would be ideal for their arguments, the compiler (or worse, the programmer) would have to pick one, and let the others be suboptimal. Not only does this produce unnecessary reference counting, it reduces the amount of code that could otherwise work error-free in the context of strict ownership. For example:
def f(x): print x
f([ 1, 2, 3 ])
f(a)
If x
had been fixed to own because of the first call,
the second call would have been less efficient, or even an error. Now
they both get to be maximally efficient because of specialization. Now
in this simple example it is easy to see that x
should
really borrow, but it isn't always that trivial, and would need a more
complex ownership analysis that employs "logical variables" much like
type inference currently does (assuming you want to do this without help
of the programmer).
There are currently some exceptions to this:
Much like variables, some of these could be relaxed/improved in the future.
These are currently how the different AST types interact lifetime wise with their children (wants) and parent (results).
Note that most of code also uses a ownership kind of "any", which means either that the ownership doesn't matter (for e.g. scalars) or that the recipient is cool with any kind of ownership.
if
results in the union of the ownership kinds of both
branches. The union is simple if both are equal or either side doesn't
care. If they differ, it defaults to own. It has specialized code to
detect if one of the branches never returns (has a return statement in
it that escapes the if
) in which the ownership kind is that
of the other branch. The condition wants to borrow. if
with
one branch of course does not result in anything.switch
is similar to if
, but currently
always defaults to owning the value returned by the cases, since doing a
correct ownership kind union between many cases in the context of
branches not returning is harder. This should be improved.and
and or
are again similar to
if
, but have some special cases because sometimes a value
is guaranteed ignored (a and b or c
never results in
a
) or values are coerced to bool
when they are
incompatible ("a" or "b"
results in a string that wants to
be owned, but [ 1 ] or "hello"
results in a
bool
with the values immediately deleted).while
borrows its condition and doesn't care about the
body.for
borrows what it iterates over and doesn't care
about its body. It produces an element variable that wants to be owned.
Note: this may change in the future to also allow borrowing.+
or many builtin functions) and thus
produce a lot of reference count churn. To avoid this, the VM allocates
strings on first use, but never lets the reference count drop to 0,
meaning they can safely be borrowed, and reduce total reference
counting. Or more simply: it's as if there exists an anonymous global
variable for each string constant that's actually used, and which you
borrow from.+
on strings wants to borrow, and returns owned. This
is the same in principle for all binary operators, but thanks to "inline
structs" this doesn't matter for most of them anymore.push
wants to own its second argument, and top
returns a borrowed value.return
typically wants to own what it
is returning, but currently makes an exception for the frequent case of
returning a single variable, where it pretends to borrow it. This is
because otherwise it would typically cause a reference count increase
followed by a decrease as the variable goes out of scope. Extending the
cases where this happens, and also generally allowing the last use of a
variable to be owned by its last use rather than waiting for the end of
the scope, is planned as a future improvement.