The above indirection is the additional runtime cost of calling a function on a dyn Trait. The current syntax is often ambiguous and confusing, even to veterans, and favors a feature that is not more frequently used than its alternatives, is sometimes slower, and often cannot be used at all when its alternatives can.
impl Trait is explained here. Rust prides itself on being capable of producing some of the fastest executables while at the same time supporting high-level abstractions.
In most cases, function calls are implemented using fast static dispatch. In this post I am going to give you a thorough introduction to this concept, leading far deeper into the rabbit hole than initially anticipated.
I will start with an example: imagine we have some common functionality encapsulated in a Backend trait, let’s say the function compute(number: i32) -> i32. Now let there be two implementations of the Backend trait, a PositiveBackend and a NegativeBackend, each doing something different in the compute function.
This means we can neither decide at runtime which one to use, nor have a list of different backend types. When using a trait object, we don’t care what exact type is used; we just make sure the given functionality is present.
We put our trait object, denoted by the dyn keyword, in a box so the compiler won’t complain about it not having a known size. The problem with trait objects using dynamic dispatch is that there is a performance penalty at runtime.
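To make this concrete, here is a minimal sketch of the setup described above. The exact signatures and what each backend computes are assumptions; the article only names the trait, the two implementors, and the compute function.

```rust
// Hypothetical sketch of the Backend trait from the text; the real
// article's implementations may differ in what compute() does.
trait Backend {
    fn compute(&self, number: i32) -> i32;
}

struct PositiveBackend;
struct NegativeBackend;

impl Backend for PositiveBackend {
    fn compute(&self, number: i32) -> i32 {
        number // identity, standing in for "positive" behavior
    }
}

impl Backend for NegativeBackend {
    fn compute(&self, number: i32) -> i32 {
        -number // negate the number
    }
}

fn main() {
    // A Box<dyn Backend> has a known size (a fat pointer), so we can keep
    // differently-typed backends in one list; calls go through the vtable.
    let backends: Vec<Box<dyn Backend>> = vec![
        Box::new(PositiveBackend),
        Box::new(NegativeBackend),
    ];
    for backend in &backends {
        println!("{}", backend.compute(10));
    }
}
```

Because the concrete type is erased behind `Box<dyn Backend>`, the choice of backend can be made at runtime, which is exactly what static dispatch cannot do.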
Let us now examine in detail the costs of dynamic dispatch with micro benchmarks. For most of the remainder of this article we will jump back and forth between tiny benchmarks and having a look at our executable.
For a short sanity check, I ran a debug build: The compiler actually computed the final result 0xb5e6218d1680 and hard coded it in our executable.
This is due to another optimization, inlining: the compiler writes the code as one big chunk, allowing the processor to process the instructions faster than it could by jumping around and calling functions. Not only is our code snippet here much shorter, we also have our function call (1) and a comparison with 20_000_000 as the loop condition. Regarding performance, our 20 million loops with function calls take about 36ms, whereas the inlined version performs 2.5 million iterations in about 3ms.
Keep in mind, though, that your code might be considerably slower when it cannot be inlined; that cost comes from the missed inlining, not from calling a function in Rust as such. Since we can now also ensure the compiler isn’t optimizing away our function, we can compare static and dynamic dispatch.
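The two dispatch strategies being compared can be sketched side by side. The trait and names are assumptions carried over from the earlier example; only the dispatch mechanism differs between the two functions.

```rust
trait Backend {
    fn compute(&self, number: i32) -> i32;
}

struct PositiveBackend;

impl Backend for PositiveBackend {
    fn compute(&self, number: i32) -> i32 {
        number + 1
    }
}

// Static dispatch: monomorphized for each concrete B, so the compiler
// knows the callee at compile time and can inline it.
fn run_static<B: Backend>(backend: &B, n: i32) -> i32 {
    backend.compute(n)
}

// Dynamic dispatch: the concrete type is erased and the call goes
// through the trait object's vtable at runtime.
fn run_dynamic(backend: &dyn Backend, n: i32) -> i32 {
    backend.compute(n)
}

fn main() {
    let backend = PositiveBackend;
    // Both produce the same result; they differ only in how the call is made.
    assert_eq!(run_static(&backend, 7), run_dynamic(&backend, 7));
    println!("{}", run_static(&backend, 7));
}
```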
Remember the visualization from above: the vtable consists of the following: a pointer to a destructor function (in this case drop; 8 bytes since I work on a 64-bit system), and two 64-bit numbers defining the size and alignment of the underlying type. This totals 24 bytes before the function pointer(s) start, exactly matching our 0x18 offset.
In my main project I have a vector of objects that all implement a trait (Device), which I need for dynamic dispatch. My expectation is that I can take an item from the list of Device implementers and get at it as a trait object so that I can pass it to the library.
However, if we're just processing one element at a time, we're potentially leaving behind opportunity for concurrency, which is, after all, why we're writing asynchronous code in the first place. Written by two experienced systems programmers, this book explains how Rust manages to bridge the gap between performance and safety, and how you can take advantage of it.
Title: Programming Rust Author(s): Jim Blandy, Jason Orendorff Release date: December 2017 Publisher(s): O'Reilly Media, Inc. ISBN: 9781491927281 One big project we undertook was changing how we update the Member List (all those nifty people on the right side of the screen).
This has obvious benefits such as less network traffic, less CPU usage, better battery life; the list goes on and on. However, this posed one big problem on the server side: we needed a data structure capable of holding hundreds of thousands of entries, sorted in a particular way, that can accept and process tons of mutations and can report back the indices where things are added and removed.
Elixir is a functional language; its data structures are immutable. This is great for reasoning about code and supporting the massive concurrency you enjoy when you write Elixir.
The BEAM VM is pretty speedy and getting faster every day. It tries to take advantage of persistent data structures where it can, but at the scale we operate, these large lists could not be updated fast enough.
Two engineers took up the challenge of making a pure Elixir data structure that could hold large sorted sets and support fast mutation operations. This is easier said than done, so let’s put on our Computer Science helmets and go spelunking into the caves of data structure design.
Elixir ships with a set implementation called MapSet. It’s useful for lots of Set operations, but it provides no guarantees around ordering, which is a key requirement for the Member List.
Ordsets are Ordered Sets, so it sounds like we found the solution to our problem; let’s break out the benchmarking to check for viability. Having exhausted all the obvious candidates that come with the language, a cursory search of packages was done to see if someone else had already solved and open sourced the solution to this problem.
A few packages were checked, but none of them provided the properties and performance required. Thankfully, the field of Computer Science has been optimizing algorithms and data structures for storing and sorting data for the last 60 years, so there were plenty of ideas about how to proceed.
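As a minimal illustration of the requirements (sorted order, no duplicates, and index reporting for additions and removals), here is a sketch in Rust using a flat sorted Vec. This is not Discord's implementation, just the interface shape the Member List needs.

```rust
// Minimal sketch of an index-reporting sorted set (hypothetical, for
// illustration only; the real structure is far more elaborate).
struct SortedSet {
    items: Vec<i64>,
}

impl SortedSet {
    fn new() -> Self {
        SortedSet { items: Vec::new() }
    }

    /// Insert `value`, returning Some(index) if it was newly added.
    fn insert(&mut self, value: i64) -> Option<usize> {
        match self.items.binary_search(&value) {
            Ok(_) => None, // already present: a set holds no duplicates
            Err(index) => {
                self.items.insert(index, value);
                Some(index)
            }
        }
    }

    /// Remove `value`, returning Some(index) if it was present.
    fn remove(&mut self, value: i64) -> Option<usize> {
        match self.items.binary_search(&value) {
            Ok(index) => {
                self.items.remove(index);
                Some(index)
            }
            Err(_) => None,
        }
    }
}

fn main() {
    let mut set = SortedSet::new();
    assert_eq!(set.insert(30), Some(0));
    assert_eq!(set.insert(10), Some(0)); // 10 sorts before 30
    assert_eq!(set.insert(20), Some(1)); // lands between 10 and 30
    assert_eq!(set.insert(20), None);    // duplicate is rejected
    assert_eq!(set.remove(30), Some(2));
}
```

A flat Vec pays an O(n) shift on every mutation near the front of a huge list, which is precisely the worst case the skip-list-like design discussed next is meant to avoid.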
If you turn your head sideways and squint real hard, this starts to look like a Skip List, which is exactly what was implemented. By leveraging compile time guards in the implementation of traversal, you can get pretty good performance in the worst case scenarios that stymie ordsets.
Insertion of an item at the end of a 250,000 item list dropped from 27,000µs to 5,000µs, five times faster than raw ordsets and 34 times faster than the naive List implementation. Rust is not a functional language, and will happily let you mutate data structures.
Elixir serves this purpose very well, and lucky for us, the BEAM VM had another nifty trick up its sleeve: NIFs (Native Implemented Functions). These are functions that are built in C or Rust and compiled into the BEAM VM.
Rustler provides nice support on the Elixir and Rust side for making a safe NIF that is well-behaved and, using the guarantees of Rust, is guaranteed not to crash the VM or leak memory. This was a benchmark just using integers, but it was enough evidence to build out support for a wider range of Erlang Terms and fill out the rest of the functionality.
With the spike showing so much promise, we continued on building out support for most Erlang Terms and all the functionality we needed for the member list. The test machine churned for a few minutes and finally printed out the result: testing multiple sizes of sets from 5,000 to 1,000,000 items, SortedSet’s best case was 0.61µs and worst case was 3.68µs.
The Rust backed NIF provides massive performance benefits without trading off ease of use or memory. Since the library operations all clocked in well under the 1 millisecond threshold, we could just use the built-in Rustler guarantees and not need to worry about reductions or yielding.
Today, the Rust backed Sorted Set powers every single Discord guild: from the 3-person guild planning a trip to Japan to 200,000 people enjoying the latest, fun game. Since deploying Sorted Set, we’ve seen performance improve across the board with no impact to memory pressure.
We can still keep our core real-time communications logic in higher-level Elixir with its wonderful guarantees and easy concurrency, while dropping into Rust when needed. I’ve had a note in my to-do list to write down some of my own thoughts about error handling in Rust for quite some time, and mostly got used to it sitting in there.
Nevertheless, a Twitter discussion brought it back to my attention, since I wanted to explain my thoughts and honestly, Twitter is just not the right medium for explaining design decisions, with its incredibly limited space and impossible-to-follow threading model. Anyway, my general view on error handling is that it is mostly fine; it would use some polishing, but hey, what wouldn’t.
And of course, the way I do error handling doesn’t necessarily mean it’s the way you need to do it too; this is very much based on personal preferences as much as on some technical reasons. But I’m a strong opponent of adding more specialized syntax for error handling specifically.
(that would be a registry of asynchronous handlers of commands, each promising to eventually maybe return a u32, but being able to fail; and I probably put too few or too many >s there, sorry if you get a headache from an unclosed delimiter) Any new syntax like fn x() -> u32 throws Error makes the connection between this and it really being a Result (with useful methods on it, and being able to be stored in a Vec) harder to grasp, without an obvious (to me) advantage.
Furthermore, it promotes error handling into some special place in the language where you could no longer write your own fully-featured Result, making std more privileged. So, if anything were to be added to the language to help with error handling, I believe it should be of general use and in line with expressing a lot with types instead of special keywords.
Some time ago I’ve seen an idea (I believe by withoutboats, but I might be mistaken) that error handling would really get better if Rust handled dynamic dispatch & downcasting in some nicer way. One could also hope for some way to just list the damn errors inline instead of having to create the whole enum out of band manually, but that comes with a full new can of worms (like creating unnameable types which make it harder to work with on the caller side) and this isn’t really that bad anyway.
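For readers unfamiliar with the "out of band" enum approach: every failure mode gets listed by hand, with Display and Error implemented manually. The names below are hypothetical, purely for illustration.

```rust
use std::error::Error;
use std::fmt;

// The "enum created out of band" way: each failure mode is listed by
// hand (hypothetical names, just for illustration).
#[derive(Debug)]
enum CacheError {
    NotFound,
    Corrupted(String),
}

impl fmt::Display for CacheError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CacheError::NotFound => write!(f, "cache entry not found"),
            CacheError::Corrupted(why) => write!(f, "cache corrupted: {}", why),
        }
    }
}

// The default Error impl is enough; source() returns None here.
impl Error for CacheError {}

fn main() {
    let err = CacheError::Corrupted("bad checksum".to_string());
    println!("{}", err);
}
```

Derive crates such as thiserror generate exactly this boilerplate, which is why the inline-listing idea is mostly a convenience rather than a necessity.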
And working with these errors is quite nice; Rust really likes enums. But let’s say we don’t really know all the ways a function can fail: either because we are lazy slackers who can’t be bothered to track it down and we don’t really care (speed of development is a valid reason), or because somewhere in there is a user-provided callback that can also fail for whatever reason our caller likes, so we can’t really limit them to our own preset of error types.
So let’s have something like Box<dyn Error + Send + Sync> (some people prefer to wrap that up into another type, but the high-level idea is the same). If we want to just log the error and terminate (either the application, or one request, or whatever), it’s fine.
But what if we want to check if it happens to be one of the specific error types we can somehow handle? If our cache fails to load, that sucks, but we can recover and regenerate it.
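The cache scenario can be sketched with std's downcast machinery. The CacheError type is hypothetical; the point is that a catch-all boxed error can still be inspected for the specific types we know how to recover from.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical recoverable error, standing in for "the cache failed".
#[derive(Debug)]
struct CacheError;

impl fmt::Display for CacheError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to load the cache")
    }
}

impl Error for CacheError {}

// Inspect a catch-all boxed error: recover if it is one we can handle.
fn handle(err: Box<dyn Error + Send + Sync>) -> &'static str {
    if err.downcast_ref::<CacheError>().is_some() {
        "regenerating the cache" // recoverable: rebuild it
    } else {
        "giving up" // unknown error: log and terminate
    }
}

fn main() {
    assert_eq!(handle(Box::new(CacheError)), "regenerating the cache");
    // &str converts into the boxed error via a std From impl.
    assert_eq!(handle("disk on fire".into()), "giving up");
}
```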
I mean, one should generally not downcast things in a perfect world, but one of the valid reasons to use Rust is that the situation is not perfect and one has to do things that generally should not be done. I’m the author of the spirit family of… let’s call it configuration manager helpers.
It takes care of loading and reloading configuration and setting up parts of an application. A lot of these errors are going to be shown to the end user, so they have to be nice and meaningful.
At that time, the failure crate appeared and it was the perfect tool for the job, because: just throw in a few derive and annotation attributes and you’re done (I believe procedural macros & derives are one of the big selling points of Rust, it saves so much work).
It has a failure::Error catch-all type that handles the open use case really nicely. Eventually, when the error bubbles all the way up, I have a multi-layer error and can print something like Configuration reload failed: Couldn't load the Foo Descriptor 'XYZ.disc': No such file or directory.
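Producing such a multi-layer message with plain std can be sketched by walking the source() chain. The two-layer error below is hypothetical, shaped to reproduce a message like the one quoted.

```rust
use std::error::Error;
use std::fmt;
use std::io;

// A hypothetical two-layer error, just for illustration.
#[derive(Debug)]
struct ReloadError {
    cause: io::Error,
}

impl fmt::Display for ReloadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Configuration reload failed")
    }
}

impl Error for ReloadError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

// Format every layer of the error, not just the top one.
fn format_chain(err: &dyn Error) -> String {
    let mut msg = err.to_string();
    let mut source = err.source();
    while let Some(cause) = source {
        msg.push_str(": ");
        msg.push_str(&cause.to_string());
        source = cause.source();
    }
    msg
}

fn main() {
    let err = ReloadError {
        cause: io::Error::new(io::ErrorKind::NotFound, "No such file or directory"),
    };
    println!("{}", format_chain(&err));
}
```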
(note that, unlike what failure proposes in the documentation, I prefer to output all the levels, not just the top one). All in all, I believe failure was a great success in the sense it showed a way forward.
But some code I was writing at the time targeted “bigger embedded“: a somewhat limited system with a different architecture, but with full std support, an OS and stuff (imagine a Raspberry Pi style device). After failure got more popular than expected and it was discovered that the reasons why it didn’t use std’s Error trait could be fixed, people started to discuss ways forward, including a std-compatible failure-0.2, extending the trait in std, etc.
And I needed to move forward with my error handling; I wanted to stop using failure for spirit. But I didn’t want to tie into one specific library again, both because everything was (is) in flux and the landscape can change, and because I no longer wanted to force anything specific onto my users.
This is a type alias to Box<dyn Error + Send + Sync>, not a new opaque structure. If something better appears, I can switch without changing the API.
And my users can use whatever other error-handling library, because this type alias is based just on std. And if you compose your error layers in some other way, the helpers to print them to logs or format them from err-context (re-exported from spirit) will work on them too.
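Such an alias can be sketched in a few lines; the name AnyError below is hypothetical, not the one spirit actually exports.

```rust
use std::error::Error;

// A sketch of such an alias: just std, no new opaque type, so callers can
// plug in whatever error-handling library they like. The name is made up.
pub type AnyError = Box<dyn Error + Send + Sync>;

fn might_fail(succeed: bool) -> Result<u32, AnyError> {
    if succeed {
        Ok(42)
    } else {
        // &str converts into the boxed error via a std From impl.
        Err("something went wrong".into())
    }
}

fn main() {
    assert_eq!(might_fail(true).unwrap(), 42);
    assert!(might_fail(false).is_err());
}
```

Because the alias is pure std, swapping the underlying machinery later changes nothing for callers who only see Result<_, AnyError>.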
I’ve already mentioned I’m impatient, so I did a very minimal thing quite early. I don’t see any error-handling crate to be an obvious winner yet, so I’m still waiting before switching.
I like the split between the parts, one for the leaf types and another for gluing things together. But it also leaks through my public API and after failure, I don’t want that again, not unless there’s a clear winner in the landscape (or even better, such opaque type gets into std).
It brings all the syn and quote compile-time dependencies, but almost any bigger end-user application has some procedural macros anyway, so their cost is already paid for. The Error trait is in std, but I believe it could go at least to alloc if not directly to core.
Sometimes I discover crates that are no_std ready, but the error types can’t follow my previous point because of that. If your public API operates with Box<dyn Error>, make absolutely sure that it’s not missing the Send + Sync parts.
A lot of the catch-all error types also mandate Send + Sync and can be created from Box<dyn Error + Send + Sync>, but not from one without the markers. The syntax of .context() on results and errors is OK, but sometimes one has to define a closure, call it directly and apply it on that just to attach the context.
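The closure workaround reads like this in plain std (using map_err to stand in for a .context() extension method, since the exact API of the context crate is not quoted here):

```rust
use std::fs;
use std::io;

// The pattern described: when several `?`s should share one piece of
// context, wrap them in a closure, call it immediately, and attach the
// context to the combined result. Plain map_err stands in for a
// .context() extension method; the path and message are made up.
fn load_config(path: &str) -> Result<String, String> {
    (|| -> Result<String, io::Error> {
        let raw = fs::read_to_string(path)?;
        Ok(raw.trim().to_string())
    })()
    .map_err(|e| format!("Couldn't load the config '{}': {}", path, e))
}

fn main() {
    let err = load_config("/definitely/not/there.toml").unwrap_err();
    println!("{}", err);
}
```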
There’s nothing I’d be outright missing, nor some form of error handling that would be impossible. Also, I don’t plan to be pulled into endless discussions about error handling.