@hackage reference-counting0.1.0.0

A reference counting library to alias linear resources

General idea

You can create an alias to a resource, and then share that alias multiple times. You can forget aliases of resources. When the last alias is forgotten, the resource is freed with the given finaliser.

Linearity in the reference counting API guarantees that an aliased resource will be freed exactly once because, in turn, all aliases of a resource need to be forgotten exactly once. Moreover, linearity exacts an explicit operation to share aliases which means we can trivially increment the reference count when share is called.

Usage

The only application to date using reference-counting (this package) is ghengin. Ghengin is a (very) experimental engine where both CPU and GPU resources required to render games are tracked linearly throughout the engine (using linear IO).

When developing such a large program using linear types, a way to share linear resources and interleave their usage in non-trivial ways throughout the renderer turned out to be critical. That's where reference-counting comes in. At that point in time, I was fortunately reading TAPL2 which has a chapter on reference counting with linear types which also inspired me to write a draft version of this library.

Note that the package is still in an experimental/candidate state where there are potential unsoundness bugs in the API design. The point of this candidate release is to receive feedback on the design and its flaws -- I'm very interested if you discover it is unsound -- it has been a while since I wrote this first, and some things look suspicious to me now. OTOH the paged-in-me was mostly sure of them at the time.

Despite my uneasiness, I have to say, as a pratical statement of reliability, that it has been working wonderfully as a central point in resource management within the engine[1].

If you are looking for a considerably complex application using reference-counting, you can try to take a look at the engine source code and hope that your eyes don't bleed out.

If you are looking for a more palpable example, there are two trivial ones in test/.

[1] Seriously, after having fully rewritten the Core of the renderer in the linear IO monad with reference counting I haven't yet encountered resource bugs, whereas previously there were so many that I decided to rewrite the engine with linear types...

Design

The key datatype is Alias (opaque, tracks references dynamically under the hood).

newAlias :: (a ⊸ μ ()) ⊸ a ⊸ m (Alias μ a)

The first argument is a function to free the resource when the last reference to it is freed, and the second argument is the resource you're creating an alias for.

When you have an alias of a resource (Alias m a, where the m is the context in which the resource can be freed), you can share the alias. Sharing an alias to a resource means there are now more aliases to the resource:

share :: Alias μ a %1 -> m (Alias μ a, Alias μ a)

In fact, you can share anything which is Shareable. Aliases are trivially shareable, but other types can be as well via generic deriving, or a little bit of effort. For example, we can share a list of aliases [Alias Renderer Vk.Buffer] to get two lists with the same aliases (([Alias Renderer Vk.Buffer], [Alias Renderer Vk.Buffer])), and all nested aliases will also get shared in order to return the two lists.

When you no longer need an alias you can forget it. If this is the last alias to the resource, the finalizer action (the first argument to newAlias) will be called on the aliased value a.

forget :: Alias μ a %1 -> μ ()

Similarly to share and Shareable, forget also works for any Forgettable type. Note that for Alias, the context in which forget is called has to be the same where the finalizer action is defined (the μ parameter to Alias), since the finalizer will be called if this was the last alias.

The tricky bits

This becomes way less simple when we allow aliased values to be actually used. That is, we only care about aliases if we can refer to the original values using them. Here be dragons.

The two key ways of using the stored value are get and use/useM. useM follows a more common Haskell pattern:

useM :: Alias μ a %1 -> (a %1 -> IO (a, b)) %1 -> IO (Alias μ a, b)

Operate on the aliased a and keep it aliased plus return the b byproduct. There is also a misnamed modifyM function which is useM without the b...

Then, there's the arguably most difficult to reason about primitive, get:

get :: Alias μ a %1 -> μ (a, a %1 -> μ ())

The get primitive will un-alias a resource and get its value. By case analysis, either this was the last alias to this resource, or it was not. If it was the last alias, we need to make sure the resource is linearly freed. If it was not, we need to make sure it isn't freed since there are still aliases to it.

How do we ensure that the resource is freed if this was the last alias, and that it is not if it wasn't? The trick is to return both the resource and a finalizer function linearly -- where the finalizer is actually a no-op if there are more aliases to that same resource.

The returned function needs to be called exactly once on the resource since it is returned in a linear tuple -- it will either make sure the resource isn't consumed (because it is fake-consumed by the finaliser), or make sure it is consumed (because the finaliser will really consume the resource if this was the last alias).

This sounds like an awful idea. Maybe we could even just get by with useM which seems less troublesome? One thing that worries me is to what extent could you call the weird-finaliser on something which wasn't related to the resource returned alongside it? ...

Notes

[!CAUTION] It seems that Haddock is rendering incorrectly the lollipop symbol () which stands for linear arrow (%1 ->) in some places. This is quite unfortunate. You may have to double check the source to see if an argument is in fact linear...

type alias for alias

Typically you'll want to export a type synonym for Alias over the monad you can free resources in, import Data.Linear.Alias qualified (most functions, such as share and get are short and are arguably better to be qualified in general), such that there are no conflicts between the type synonym Alias and the library defined Alias.Alias name

import qualified Data.Linear.Alias as Alias
type Alias = Alias.Alias Linear.IO -- Choose the monad you can allocate and free this
                                   -- reference counted resource in

opaqueness for soundness?

Perhaps this is very important to explain somehow earlier on or enforce: The aliased things should be opaque. That is, you should not be able to easily modify the value which is aliased into something else. So, good aliasable values would be e.g. pointers, handles, buffers, devices, GPU textures, IORefs, etc...

The problem here is the underlying representation of Alias m a which will not use a mutable reference like IORef for the a across the aliases (I've wondered before if it should, but it never made sense for my case where the things I was aliasing were already mutable like Vk.Buffer; leaving it as a doesn't preclude the user from stuffing an IORef to alias).

So, if the a value is changeable, you could end up in a situation where you have two aliases that should hold the same value, but are different. In effect, whichever of the two aliases is forgotten last will call the finalizer action on the a value. If these values are different, the finalizer action will depend on the last forgotten resource... which is very unsound.

thread safety

We are using atomic-counter for counting, but what happens if there two actions on the same alias at the same time? It may just be that the responsibility of thread safety when two aliased resources are used at the same time is of the resource type (it seems that way).

For example, if the aliased resource was an IORef, it'd be the caller's responsibility to use atomicModifyIORef if a multi-threaded program were using multiple aliases to that same ref at once.