Highly parallelizable across any number of threads and SIMD lanes, because it's a Merkle tree on the inside. Capable of verified streaming and incremental updates, again because it's a Merkle tree.
A PRF, MAC, KDF, and XOF, as well as a regular hash. One algorithm with no variants, which is fast on x86-64 and also on smaller architectures.
The chart below is an example benchmark of 16 KiB inputs on modern server hardware (a Cascade Lake-SP 8275CL processor). BLAKE3 is based on an optimized instance of the established hash function BLAKE2 and on the original Bao tree mode.
The current version of Bao implements verified streaming with BLAKE3. The blake3 Rust crate includes optimized implementations for SSE2, SSE4.1, AVX2, AVX-512, and NEON, with automatic runtime CPU feature detection on x86.
It uses multithreading by default, making it an order of magnitude faster than e.g. sha256sum on typical desktop hardware. If you want to see how BLAKE3 works, or you're writing a port that doesn't need multithreading or SIMD optimizations, start here.
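The tree structure is what unlocks this parallelism. The sketch below is a toy illustration of the idea only, not real BLAKE3: it uses std's DefaultHasher in place of BLAKE3's compression function, and the chunk size and function names are assumptions for the example. The point is that leaves hash independently, so they can be farmed out to SIMD lanes or threads, while parents combine child hashes up to a single root.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Toy tree hash (NOT real BLAKE3): leaves are hashed independently,
// which is what makes threads and SIMD lanes easy to use.
fn leaf_hash(chunk: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(chunk);
    h.finish()
}

// Parents hash the concatenation of their two children's hashes.
fn parent_hash(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    h.write_u64(left);
    h.write_u64(right);
    h.finish()
}

fn tree_hash(input: &[u8], chunk_size: usize) -> u64 {
    // Every leaf below could be computed in parallel.
    let mut level: Vec<u64> = input.chunks(chunk_size).map(leaf_hash).collect();
    if level.is_empty() {
        level.push(leaf_hash(b""));
    }
    // Reduce pairwise until a single root remains.
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|p| if p.len() == 2 { parent_hash(p[0], p[1]) } else { p[0] })
            .collect();
    }
    level[0]
}
```

The real blake3 crate hides all of this behind a one-call API (`blake3::hash(b"...")`); the sketch only shows why leaf hashing has no sequential dependency.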
This post describes a simple technique for writing interners in Rust which I haven’t seen documented before. The canonical example would be a compiler: most identifiers in a program are repeated several times.
Interning works by ensuring that there’s only one canonical copy of each distinct string in memory. Interned strings themselves can be represented with an index (typically u32) instead of a (ptr, len) pair.
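A naive interner along these lines might look like the following sketch (the names and structure are illustrative, not the post's exact code): each distinct string costs two allocations, but afterwards callers pass around a cheap u32.

```rust
use std::collections::HashMap;

// A minimal (allocating) interner: each distinct string is stored
// once, and callers hold a u32 index instead of a (ptr, len) pair.
#[derive(Default)]
struct Interner {
    map: HashMap<String, u32>,
    vec: Vec<String>,
}

impl Interner {
    fn intern(&mut self, name: &str) -> u32 {
        if let Some(&idx) = self.map.get(name) {
            return idx; // already interned: no allocation
        }
        let idx = self.vec.len() as u32;
        self.map.insert(name.to_owned(), idx);
        self.vec.push(name.to_owned());
        idx
    }

    fn lookup(&self, idx: u32) -> &str {
        &self.vec[idx as usize]
    }
}
```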
So I’ve spent a part of the evening cobbling together a non-allocating trie-based interner. The result: the trie does indeed asymptotically reduce the number of allocations from O(n) to O(log(n)).
Also, Rust’s HashMap (implemented by hashbrown, which is based on SwissTable) is fast. An interned string is represented by a Span (a pair of indexes) into the big buffer.
We use binary search for indexing and simple linear shift insertion. Layer n stores a number of chunks of length 2^n (in a single contiguous array).
However, implementing a trie made me realize that there’s a simple optimization we can apply to our naive interner to get rid of extra allocations. The problem here is that we can’t actually write implementations of Eq and Hash for Span to make this work.
Moreover, even if HashMap allowed supplying a key closure at construction time, it wouldn’t help! What would work is supplying a key_fn at the call site for every HashMap operation, but that would hurt ergonomics and ease of use a lot.
This exact problem comes up in the design of lazy values in Rust. If you find yourself in need of such a “call-site closure” container, you can use a sorted Vec; indexing with binary_search_by_key is exactly this pattern.
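As a hedged illustration of that pattern, here is a sorted Vec where the key-extraction closure is supplied at every call site, which is exactly what HashMap's API cannot express (the tuple layout and helper names are assumptions for the example):

```rust
// Insert keeping the Vec sorted by the first tuple field.
// The key-extraction closure is passed at the call site.
fn insert_sorted(v: &mut Vec<(u32, &'static str)>, item: (u32, &'static str)) {
    match v.binary_search_by_key(&item.0, |&(k, _)| k) {
        Ok(pos) | Err(pos) => v.insert(pos, item),
    }
}

// Lookup passes the same call-site closure again.
fn get<'a>(v: &'a [(u32, &'static str)], key: u32) -> Option<&'a str> {
    v.binary_search_by_key(&key, |&(k, _)| k)
        .ok()
        .map(|pos| v[pos].1)
}
```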
If the buffer is full (so that adding a new string would invalidate old pointers), we allocate a new buffer, twice as large, without copying the contents of the old one. That way, we avoid a bounds check and/or .unwrap when accessing the active buffer.
To be on the safe side, we can use *const str instead, with a bit of boilerplate to delegate PartialEq and Hash. The critical detail that makes our use of a fake 'static lifetime sound here is that the alloc function is private.
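Putting the pieces together, an interner along the lines the post describes might look like this sketch (method names and the capacity policy are assumptions): new strings are appended to a current buffer, full buffers are retired into `full` rather than freed or copied, and the private `alloc` hands out `'static` references that are sound only because the backing allocations live as long as the interner itself.

```rust
use std::collections::HashMap;
use std::mem;

pub struct Interner {
    map: HashMap<&'static str, u32>,
    vec: Vec<&'static str>,
    buf: String,        // active buffer new strings are appended to
    full: Vec<String>,  // retired buffers, kept alive, never copied
}

impl Interner {
    pub fn with_capacity(cap: usize) -> Interner {
        Interner {
            map: HashMap::default(),
            vec: Vec::new(),
            buf: String::with_capacity(cap.next_power_of_two()),
            full: Vec::new(),
        }
    }

    pub fn intern(&mut self, name: &str) -> u32 {
        if let Some(&id) = self.map.get(name) {
            return id;
        }
        let name = unsafe { self.alloc(name) };
        let id = self.vec.len() as u32;
        self.map.insert(name, id);
        self.vec.push(name);
        id
    }

    pub fn lookup(&self, id: u32) -> &str {
        self.vec[id as usize]
    }

    // Sound only because this is private and retired buffers in
    // `full` stay alive for as long as the interner does.
    unsafe fn alloc(&mut self, name: &str) -> &'static str {
        let cap = self.buf.capacity();
        if cap < self.buf.len() + name.len() {
            // Grow by at least doubling; move (not copy) the old
            // buffer into `full` so existing &str stay valid.
            let new_cap = (cap.max(name.len()) + 1).next_power_of_two();
            let new_buf = String::with_capacity(new_cap);
            let old_buf = mem::replace(&mut self.buf, new_buf);
            self.full.push(old_buf);
        }
        let interned = {
            let start = self.buf.len();
            self.buf.push_str(name);
            &self.buf[start..]
        };
        &*(interned as *const str)
    }
}
```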
@juditacs did a great job of aggregating different solutions in several programming languages, creating a Dockerfile to easily set up and benchmark all programs against a Hungarian Wikipedia text dump.
The main optimizations:
- Faster hashing
- Buffered I/O, either using buffered I/O classes or by explicitly reading chunks into an array
- Processing recurring tokens separately from singletons (big performance gain)
- Working directly with raw bytes (safe for UTF-8), to reduce memory use and avoid text-decoding cost
The Rust solution is straightforward and the code is fairly easy to understand.
The runtime of the baseline Rust program against 5M lines of Hungarian Wikipedia is around 28 seconds. The programming challenge consists of basically two parts: first gathering all words into a HashMap, and secondly arranging the words into an array to sort them.
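A minimal sketch of those two parts, assuming simple whitespace tokenization (the real benchmark's tokenization rules may differ):

```rust
use std::collections::HashMap;

// Part 1: gather word counts into a HashMap.
// Part 2: move them into a Vec and sort by descending frequency
// (ties broken alphabetically for a deterministic order).
fn word_counts(text: &str) -> Vec<(String, u64)> {
    let mut counts: HashMap<String, u64> = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_owned()).or_insert(0) += 1;
    }
    let mut pairs: Vec<(String, u64)> = counts.into_iter().collect();
    pairs.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    pairs
}
```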
In Rust 1.36 we got a new default implementation for HashMap, based on hashbrown (a port of SwissTable). Desired effect: we want to buffer our output to minimize the number of syscalls we make.
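A BufWriter achieves exactly that: many small writeln! calls are coalesced into a few large writes. A sketch (the function name and tab-separated output format are assumptions):

```rust
use std::io::{self, BufWriter, Write};

// Wrap any writer in a BufWriter so that many small writes are
// flushed to the underlying writer in large batches (few syscalls).
fn write_counts<W: Write>(out: W, pairs: &[(&str, u64)]) -> io::Result<()> {
    let mut out = BufWriter::new(out);
    for (word, count) in pairs {
        writeln!(out, "{}\t{}", word, count)?;
    }
    out.flush()
}
```

In the real program the writer would be `io::stdout().lock()`; using a generic `W: Write` keeps the sketch testable against an in-memory buffer.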
All benchmark optimizations walk a thin line between “smart hack” and “cheat”. This optimization works well with the test data (Hungarian Wikipedia).
Another performance optimization we can make is to use raw bytes instead of UTF-8 Strings. In Rust, a String checks its content for UTF-8 validity, something that can be avoided by using plain byte arrays.
The std::str::from_utf8_unchecked function, which can only be called in an unsafe block, is used to print the byte array. This is an unsafe operation, but because we know the input is valid UTF-8, we indicate to the compiler that we know what we are doing by putting the call inside an unsafe block.
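A minimal sketch of that pattern (the wrapper name is an assumption; the safe alternative is std::str::from_utf8, which returns a Result and pays for a validity scan):

```rust
// Skip UTF-8 validation when the bytes are already known to be valid.
fn bytes_to_str(bytes: &[u8]) -> &str {
    // SAFETY: the caller guarantees `bytes` is valid UTF-8, e.g.
    // because it was sliced out of input that was already validated.
    unsafe { std::str::from_utf8_unchecked(bytes) }
}
```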
All in all, our improvements gained us a 25% faster binary compared to the baseline Rust solution. wyhash is the default hasher for the hash tables of the Zig, V, and Nim languages.
Benchmark results:
Hash function | short hash (/us) | bulk 256 B (GB/s) | bulk 64 KB (GB/s)
wyhash_final2 | 205.45 | 21.09 | 25.15
wyhash_final1 | 195.42 | 17.97 | 23.44
xxh3:avx2 | 147.33 | 9.73 | 45.39
xxh3:sse2 | 154.30 | 11.53 | 27.15
xxh3:scalar | 153.61 | 8.49 | 13.05
xxHash64 | 83.10 | 10.89 | 14.72
t1ha2_atonce | 115.12 | 12.96 | 17.64
(A separate chart benchmarks random-size inputs.)
Announced October 8, 2020, Rust 1.47.0 has no new language features but enhances the standard library.
Quality-of-life and toolchain improvements, as well as library stabilizations, are featured in the release. A “const generics” feature impacts traits on larger arrays.
Rust has lacked a way to be generic over integer values, which has caused problems with arrays. Rustc now supports -C control-flow-guard, an option that will switch on the Control Flow Guard security capability on Windows.
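With const generics, a single impl can cover arrays of every length, where previously the standard library had to write one impl per length up to 32. A hedged sketch (the trait is invented for illustration):

```rust
// A single impl, generic over the array length N, covers [i64; 3]
// and [i64; 40] alike -- lengths beyond the old limit of 32 included.
trait Total {
    fn total(&self) -> i64;
}

impl<const N: usize> Total for [i64; N] {
    fn total(&self) -> i64 {
        self.iter().sum()
    }
}
```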
Several core language features can now be used in const fn, including if, if let, match, and several others. The #[track_caller] attribute, designed to improve error messages when unwrap and related functions panic, is now stable.
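Two small sketches of those features (the function names are illustrative):

```rust
// `if` now works inside const fn, so this evaluates at compile time.
const fn clamp_non_negative(x: i32) -> i32 {
    if x < 0 { 0 } else { x }
}

const ZERO: i32 = clamp_non_negative(-5);

// #[track_caller] makes a panic report the caller's source location
// instead of this function's body -- the same mechanism that
// improves unwrap's error messages.
#[track_caller]
fn checked_div(a: u32, b: u32) -> u32 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}
```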
A fix is offered to mend some longstanding unsoundness when casting between integers and floats. Stabilization is offered for function-like procedural macros in expressions, patterns, and statements.
The expanded use of macros assists with the Rocket web framework for Rust. The full list of APIs can be found in the Rust Blog.
It also updates the OpenSSL version used by the Cargo package manager. Rust 1.27 introduced support for detecting x86 CPU features in the standard library, via the is_x86_feature_detected! macro.
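A hedged sketch of runtime dispatch with that macro (the function and the chosen feature set are assumptions; the macro only exists on x86/x86_64, hence the cfg guard):

```rust
// Pick the best available implementation at runtime based on
// detected CPU features; fall back to a scalar path elsewhere.
fn simd_level() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if is_x86_feature_detected!("sse4.1") {
            return "sse4.1";
        }
    }
    "scalar"
}
```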
Rust 1.43 broke support for listing files included in packages published with Cargo, when executed inside a workspace with path dependencies or unpublished versions. The team has no evidence the vulnerability could compromise Cargo users’ security.
Announced April 23, 2020, Rust 1.43.0 was considered a fairly minor release, with no major features introduced. The type inference around primitives, references, and binary operations was improved.
In the Rust library, developers can use associated constants on floats and integers directly, without having to import the module. The Cargo package manager takes advantage of pipelined compilation automatically with Rust 1.38.
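For example (an illustrative function, not from the release notes), i32::MAX can be named directly with no `use std::i32;` import:

```rust
// Associated constants live directly on the primitive type.
fn saturating_next(x: i32) -> i32 {
    if x == i32::MAX { i32::MAX } else { x + 1 }
}
```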
Some tests have shown compilation speed increases of 10 to 20 percent for optimized, clean builds of some crate graphs. The cargo vendor command, previously a separate crate, is now built into Cargo.
Version 1.36 of the Rust systems programming language was released in July 2019. The newly stabilized alloc crate collects all the pieces of Rust’s standard library that depend on a global memory allocation mechanism, such as Vec<T>.
A new type, MaybeUninit<T>, provides a sound way to work with uninitialized memory, replacing the error-prone mem::uninitialized function.
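A minimal sketch of the MaybeUninit workflow, using the raw-pointer initialization style available since 1.36 (the function is illustrative): create uninitialized storage, initialize it fully, and only then call assume_init.

```rust
use std::mem::MaybeUninit;

// Create storage without initializing it, write the value through a
// raw pointer, and only then assert that it is initialized.
fn make_pair() -> (u32, u32) {
    let mut slot = MaybeUninit::<(u32, u32)>::uninit();
    unsafe {
        // SAFETY: the value is fully initialized before assume_init.
        slot.as_mut_ptr().write((1, 2));
        slot.assume_init()
    }
}
```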
Clippy, which provides a collection of lints to catch common mistakes, added a drop_bounds lint, triggered when adding a bound T: Drop to a generic function. A number of changes have been made to Cargo, such as the addition of a rustc-cdylib-link-arg key for build scripts to specify linker arguments for cdylib crates.