Rust ’s imperative control flow statements like return and break inside denotation also doesn’t make sense, because we do not have TCP preserving closures. “ Do notation refers to Haskell’s do notation, which is a special syntax for manipulating monads without necessitating long sequences of nested functions (we’ll see an example of this shortly).
“TCP” is an obscure initialism, but people seem to use it in this sense: a language follows “TCP” if any expression exp is equivalent to (|| exp)(). In general in Rust this is true, but it breaks down when exp contains control flow.
With more complex sequences of bind s and unit s, the difference becomes even more pronounced. This is the problem without boats is referring to when they say that return and break don’t make sense in do notation.
, but this is very much an artificial (and to the user, a seemingly-arbitrary) solution and limits the general applicability and usefulness of do notation in Rust. We’ll use a less specific example for the proposed sugaring so that it’s slightly clearer.
If we want to forbid control flow at the top level, we need some custom error handling, but it’s straightforward, technically, to implement. As a serious proposal for a do notation sugaring in Rust, this has its flaws.
Simulating the control flow in this way increases branching and it is thus likely not to have ideal performance. If we actually did want some form of do notation, we’d probably prefer to handle control flow more directly in the compiler, rather than simulating it using Contraflow.
But hopefully this demonstrates that it’s not necessary to have special handling to support monodic do in Rust and at least it’s not an issue with control flow that makes do notation implausible. The issue with control flow in do notation is not the only one that was raised (see this tweet and this one for the others), so we don’t have a full solution yet, but by tackling the difficulties one at a time we can get closer to a point at which we understand where the difficulty with these abstractions in Rust really lie.
The dot operator will perform a lot of magic to convert types. Rust do is a monodic denotation using macros and duck typing.
Some functions are provided for a number of common monodic structures. You can redistribute it and/or modify it under the terms of the Do What The Fuck You Want To Public License, Version 2, as published by Sam However.
Monads (and, more generally, constructs known as “higher minded types”) are a tool for high-level abstraction in programming languages. Historically, there has been a lot of debate inside (and outside) the Rust community about whether monads would be a useful abstraction to have in the language.
Thus, to facilitate discussion, I want to demonstrate that monads are feasible in Rust. Specifically, in this article, I’m going to describe a new approach to express monads in Rust.
This approach depends on a very minimal extension to Rust ’s type system. In particular, the design is significantly influenced by several quirks found in the Rust type system that will be obvious to those who have not already come across them.
To make this design accessible and hopefully also to explain why monads are hard(er than you might think) in Rust, I’m going to build up the definition gradually, explaining the motivation behind the choices. There are several problems with a naïve design of monads (and similar abstractions) in Rust.
Extending Rust to support full higher-kinded types raises many significant design questions. A naïve implementation of do notation would be surprisingly limited in Rust, as control-flow is captured by closures and would thus be useless in bind sugaring.
As such, any design that purports to facilitate monads must provide solutions for each of these problems. We’ll see how this works in detail below, but here’s a summary of the idea behind this design, to set the scene.
This is all a bit hand-wavy without seeing it in action, though, so we’ll leave the theory there and get into the details. The meaning should hopefully be clear even if you’re not familiar with Haskell’s syntax (just read class as trait and lowercase letters as uppercase ones).
Note on syntax I’m going to start by using imply Trait in both argument- and return-position, because I think it indicates intent more clearly. Later, I’ll demonstrate why these are insufficient, but for now I want to prioritize readability.
The problem is that we have no way to get a handle on the type constructor that we’re implementing Functor for. With a generic associated type, we can actually define a signature for map that looks like something we might expect.
In the definition above, though, the type is completely generic; there’s no reason to believe that Self and HigherS elf
In a real implementation, one may or may not want to enforce this property: having these equality relations is truer to the original concept, but in practice probably has little additional benefit. Rust has several notions of “function”, which is another reason the definition of Functor isn’t quite so straightforward as in Haskell.
(This type is unnameable in Rust, but may be printed in diagnostic messages.) So can unique closure types, as long as they do not capture variables from their environment.
Three traits for abstracting over functions: FN, Input and Nonce. The only facility we require for this design that is not currently an accepted language feature in Rust is the concept of (generic) associated traits.
In practice, we would expect Main to be instantiated to one of FN(A) B, Input(A) B or Nonce(A) B, and we could enforce this, but there’s no real need to do so. (The reasons for this are to facilitate a design pattern called external iteration, but the motivation is really tangential here.
The fact that we want Iterator to be factorial means we need to abstract over this design pattern.) While it may not look like it in our signature, imply Trait in argument-position corresponds to a generic type argument.
Now that the return type also needs to capture the mapping function, it doesn’t make so much sense to call it HigherS elf : it still corresponds in some sense to Self, but at the same time, it’s been specialized to the map function signature. I’ll call it Map for simplicity (not to be confused with the identically-named inter::Map … sorry).
Many of the difficulties in defining Monad are those we already encountered when figuring out how to implement Functor : the hardest parts are behind us. However, there is one subtlety in particular in how we define bind, which is a slightly more complex situation than map.
This time, I’m going to start with the complete definition in Rust, as it should be mostly familiar, and explain the parts that are new. The unit function is straightforwardly defined using the same reasoning as Functor::map.
(Here, there’s an additional Strait bound on Unit, but that’s not important to understand yet.) In general, though, for each function we define that uses Self in a higher-kinded fashion, we need to use a generic associated type just like for map.
This means we have to take a function that returns a monad (specifically, the current one) parameterized over B. The bind operation for the Iterator trait corresponds to the flat_map method.
We should be able to return any type that implements Iterator from the flat-mapping function we pass to flat_map. Therefore, we want to simply ensure it returns some type that implements our monad.
(If this still doesn’t click just yet, wait till you see the examples, which should make things clearer.) (This also effectively falls out of the difference between argument-position and return-position imply Trait, albeit in a slightly disguised form.
It does simplify things though (and is arguably closer to the higher-kinded viewpoint, as we’re defining everything from a higher level of abstraction). For monodic types, we can now simply use I'd for Strait and everything works as expected.
In the examples above, though the implementations of Functor and Monad for types and iterators are straightforward, they do contain a lot of boilerplate. This is made clear by observing how the entirety of the implementation definitions are determined by the map, unit and bind functions for Functor and Monad respectively.
Often types and traits implementing these higher-kinded traits already define these functions (albeit with different names): we’ve already seen this with Option and Iterator, where we simply delegate to existing functions, and it’s certainly the case for many of the functors and monads in the Rust standard library. In these cases, we could use a custom # attribute to generate the boilerplate automatically for us in the background.
The message to take away here is that, though the definitions might seem intimidating at first, in all likelihood they would rarely need to be implemented directly by the user. It would be remiss of me to talk about the feasibility of monads without even mentioning do notation.
For those of you who are unfamiliar with do notation, it provides a convenient synaptic sugar for working with monads without having to make do with deeply-nested bind s and unit s in Haskell. The main argument against the feasibility of do notation in Rust is the difficulty with composing control flow expressions (such as return or break) with closures.
This is a problem, because do notation is sugared in terms of nested bind s, which take closures as arguments. What we might expect to work intuitively in a naïve implementation of do would fail unexpectedly.
Obviously, this doesn’t work, because we’re trying to break the outer loop from inside the bind closure. I wrote a separate post on the topic a while ago, describing how we can resolve this dichotomy.
In essence, control flow and do notation are not mutually exclusive: you just need to be a little cleverer in your sugaring. The main takeaway here is that do notation in Rust could make sense: even though by necessity we’re working with closures behind the scenes, we can still recover the expressively of the notation with some sleight of hand.
All the complexity is hidden away in the definition (and to some extent, the trait implementation). Functors and monads are, in one respect, simply the first rung on the ladder of higher-kinded types.
When I spent a little time playing around with potential definitions of monad transformers, I kept running into design problems: exactly the same tricks as for monads don’t quite carry across, because we end up wanting to quantify over all traits in an implementation, rather than one trait per implementation. The main takeaway here is that, even though the designs here may not (obviously) generalize to all higher-kinded types, they generalize to a wide class of examples that could reasonably be considered sufficient for most of the abstraction users generally want to express in Rust.
We’ve defined an abstraction for functors and monads in Rust, two common higher-kinded types, using only one (conservative) hypothetical language feature: associated traits. We’ve shown how implementation boilerplate can be close to eliminated using # macros.
We’ve illustrated how control flow is compatible with a natural do notation in Rust. Many have expressed reservations about the feasibility of providing abstractions over higher-kinded types like monads in Rust (perhaps the most notable being this thread, which I address directly in the appendix for completeness).
In particular, I think monads fall naturally out of a conservative extension of the language that is beneficial even without considering the potential for higher-kinded types. IN CONCLUSION: a design that works in a pure FP which lazily evaluates and boxes everything by default doesn’t necessarily work in an eager imperative language with no runtime.
I haven’t explicitly addressed each of the points in the aforementioned “Monads and Rust thread in the main post, because I think it is self-evident that the design here is sufficient to capture Monad in Rust. However, I shall do so briefly here, to satisfy any suspicions of avoiding difficulties by omission.
Borrowing across yield in do notation : there is a problem with pretending yield is simply the bind of the Future monad (despite both having similar effects), namely when borrowing is involved. However, we can use yield just fine within do notation (using the same propagation of control flow from closures as break or return).
So yield per se isn’t a problem with do notation : we just can’t abstract it away entirely. Control flow in do notation : as illustrated above, using a straightforward sugaring, we can recover full control flow within do notation (arguably solving an open research question).
Abstracting over monads would be ergonomic in Rust : using #, implementing higher-kinded types can be made very minimal and ergonomic. In this post I’ve focused on functors and monads for Option and Iterator in particular.
For completeness, I’m going to give some additional examples, to demonstrate that the framework here really is general enough to encompass the traditional and desirable use cases. This appendix can be viewed as a reference more than a section worth reading in its own right.
It’s interderivable with bind, so I haven’t included it in the definition above, but we could provide it as a default implementation on the Monad trait. These are both plausible language extensions that also do not introduce the full complexity required by general higher-ranked types.
However, I do not dwell on them here: I think associated traits provide the most immediate benefit of these type system extensions (it’s better to focus on one feature at a time).