1. Rust Fundamentals (Beginner)
Why Rust Is Different: Compile-Time Safety Without GC
Rust guarantees memory safety and thread safety at compile time without requiring a garbage collector. The borrow checker enforces ownership rules statically, preventing an entire class of bugs that plague C/C++: dangling pointers, data races, use-after-free, and buffer overflows. In C/C++, these errors slip through to runtime, caught only by external tools like Valgrind or AddressSanitizer. In Rust, the compiler rejects such code outright: it never compiles, let alone runs.
The three pillars of Rust's safety model are:
- Ownership: Every value has exactly one owner at a time.
- Borrowing: Owners can lend references (immutable or mutable) without losing ownership.
- Lifetimes: References are tracked to ensure they never outlive their referent.
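The first two pillars show up even in tiny programs. A minimal sketch (function names are illustrative):

```rust
// Ownership: the String moves into this function and is dropped at the end.
fn takes_ownership(s: String) -> usize {
    s.len()
}

// Borrowing: read through a shared reference; the caller keeps ownership.
fn borrows(s: &str) -> usize {
    s.len()
}

fn main() {
    let s = String::from("hello");
    assert_eq!(borrows(&s), 5);        // lend `s`: still usable afterwards
    assert_eq!(takes_ownership(s), 5); // `s` moves here; the binding is now invalid
    // println!("{s}");                // ERROR: borrow of moved value
}
```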
Strings: String vs &str
String is a heap-allocated, owned, growable UTF-8 string. You can mutate it, append to it, and resize it. Think of it as Vec<u8> with UTF-8 guarantees: it owns the bytes on the heap.
&str (string slice) is a borrowed reference to a contiguous sequence of UTF-8 bytes. It's always immutable and can point to three places: the heap (borrowed from a String), the binary (string literals like "hello"), or the stack. Think of it as &[u8] with UTF-8 guarantees. Use &str in function signatures for flexibility — it accepts both String and string literals via automatic deref coercion.
fn greet(name: &str) { // accepts both String and &str
println!("Hello, {name}");
}
fn main() {
let owned: String = String::from("Alice"); // heap-allocated, growable
let borrowed: &str = "Bob"; // string literal, static lifetime
greet(&owned); // auto-deref coercion: &String → &str
greet(borrowed); // already &str, no conversion needed
// Append to String (owned)
let mut s = String::from("Hello");
s.push_str(" world"); // OK: String is mutable
// Can't append to &str (immutable and no ownership)
// borrowed.push_str("!"); // ERROR
}
Best practice: Prefer &str in function signatures unless you need to modify or take ownership of the string. This makes your code more flexible and reusable.
Collections: Arrays, Slices, and Vectors
Rust provides three collection types for storing sequences:
- [T; N] — Fixed-size array. The size is known at compile time and fixed. Stored entirely on the stack. Example: let arr: [i32; 3] = [1, 2, 3];
- Vec<T> — Dynamic growable vector. Stored on the heap, with capacity managed at runtime. Can grow, shrink, push, pop. Example: let mut v: Vec<i32> = vec![1, 2, 3];
- &[T] — Slice (borrowed view). A borrowed window into contiguous memory — could be from an array, Vec, or even stack data. Always immutable and unsized at compile time. Functions should generally accept &[T] for maximum flexibility.
Best practice: Write functions that accept &[T] instead of &Vec<T>. This allows callers to pass arrays, vectors, or other slice sources without conversion.
// Fixed-size array (stack)
let arr: [i32; 3] = [1, 2, 3];
// arr[0] = 5; // ERROR: the binding is immutable (no `mut`)
// Dynamic vector (heap)
let mut vec = vec![1, 2, 3];
vec.push(4); // OK: Vec grows
// Slice (borrowed view)
fn sum(slice: &[i32]) -> i32 {
slice.iter().sum()
}
sum(&arr); // pass array as slice
sum(&vec); // pass vec as slice
sum(&vec[1..3]); // pass partial slice
Scalar Types: The Building Blocks
Rust's scalar types fall into four categories: integers (signed and unsigned), floating-point numbers, booleans, and characters.
| Category | Types | Size / Details | Example |
|---|---|---|---|
| Signed integers | i8, i16, i32, i64, i128, isize | 8 bits to 128 bits | let x: i32 = -42; |
| Unsigned integers | u8, u16, u32, u64, u128, usize | 8 bits to 128 bits | let x: u8 = 255; |
| Floating-point | f32, f64 | IEEE 754 single / double precision | let x: f64 = 3.14; |
| Boolean | bool | 1 byte (true or false) | let x: bool = true; |
| Character | char | 4 bytes (any Unicode scalar value) | let x: char = 'A'; |
isize and usize are pointer-sized integers — they match your machine's pointer width (32 bits on 32-bit systems, 64 bits on 64-bit systems). Use usize for indexing, loop counts, and collection lengths.
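A quick illustration of usize in practice; the final assertion checks that usize really is pointer-sized:

```rust
fn main() {
    let v = vec![10, 20, 30];
    let i: usize = 1;                    // indexing takes usize
    assert_eq!(v[i], 20);
    let len: usize = v.len();            // lengths are usize too
    assert_eq!(len, 3);
    // usize always matches the target's pointer width:
    assert_eq!(std::mem::size_of::<usize>(),
               std::mem::size_of::<*const u8>());
}
```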
Variable Shadowing vs Mutability
Rust offers two ways to change a binding: shadowing (rebind with let) and mutability (change with let mut). They look similar but behave differently:
- Shadowing: Creates a new binding (new variable) with the same name. The old one is destroyed. Can change the type. Useful for transforming a value through stages (e.g., string → parsed number).
- Mutability: Allows changing the value of an existing binding. Type stays the same. Useful when you're updating the same logical variable.
// === SHADOWING ===
// Read input as string, parse to number (same name, different type)
let guess = "42"; // &str
let guess: i32 = guess.parse().unwrap(); // i32 — NEW binding
// Shadowing is VERY common for transformations
let x = 5;
let x = x + 1; // OK: new binding
let x = x * 2; // OK: another new binding
// === MUTABILITY ===
// Change value of same binding (same type)
let mut y = 5;
y = 10; // OK: SAME binding, same type
// y = "hello"; // ERROR: type mismatch
// Use mut when looping or accumulating
let mut count = 0;
for i in 0..5 {
count += i;
}
Best practice: Use let mut only when needed. This makes code intent clearer — a mutable binding says "this will change," while an immutable one signals stability.
2. Ownership & Borrowing (Core)
The Three Ownership Rules
Rust's ownership system is the foundation of memory safety. Every value in Rust has three critical properties:
- Each value has exactly one owner. No shared mutable state. The owner controls when the value is deallocated.
- When the owner goes out of scope, the value is dropped. The destructor runs (memory freed, file handles closed, locks released). No garbage collector needed.
- There can be either one mutable reference OR any number of immutable references — never both. This rule prevents data races at compile time.
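A small sketch of all three rules together (the helper name is made up for illustration):

```rust
// Rule 3 in action: many shared borrows, then one exclusive borrow.
fn exclaim(mut s: String) -> String {
    let r1 = &s;                  // any number of shared borrows...
    let r2 = &s;
    let _ = (r1.len(), r2.len()); // ...while no &mut is live
    let m = &mut s;               // exactly one mutable borrow (shared ones ended)
    m.push('!');
    s                             // rule 1: ownership moves back to the caller
}

fn main() {
    let out = exclaim(String::from("hi"));
    assert_eq!(out, "hi!");
} // rule 2: `out` goes out of scope here and its heap buffer is freed
```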
Understanding Copy vs Clone
Copy and Clone are two different ways to create a duplicate. They differ fundamentally in how they work:
Copy is a marker trait that makes assignment implicitly copy the value. This is only safe for types that can be safely memcpy'd — namely, stack-only types with no heap data. When you assign a Copy type, Rust silently copies every bit, and both the old and new binding remain valid. Integers, floats, bools, chars, and tuples of Copy types are Copy.
Clone is an explicit, potentially expensive deep copy invoked with .clone(). Any type can implement Clone, even if it has heap data. Cloning is always intentional and visible in code. If a type implements Copy, it also implements Clone automatically, but not vice versa. String implements Clone but NOT Copy because it has heap-allocated data.
// Copy types — assignment copies, no move
let a = 42;
let b = a; // a is still valid (i32 is Copy)
println!("{a} {b}"); // OK! Both valid
// Non-Copy types — assignment moves ownership
let s1 = String::from("hello");
let s2 = s1; // s1 is MOVED — ownership transferred
// println!("{s1}"); // ERROR: borrow of moved value
println!("{s2}"); // OK
// Use .clone() for explicit deep copy
let s3 = String::from("world");
let s4 = s3.clone(); // deep copy — both s3 and s4 valid
println!("{s3} {s4}"); // OK! Both owned
Borrowing Rules: The Core Guarantee
Rust's borrowing rules are designed to prevent data races. A data race occurs when: (1) two or more pointers access the same data, (2) at least one writes, and (3) there's no synchronization. This leads to undefined behavior, crashes, and security vulnerabilities.
Rust statically prevents data races by enforcing one simple rule: at any point in time, you can have either one mutable reference OR any number of immutable references, but NOT both. This guarantees that no reader ever sees partially written data.
Key insight: An immutable reference gives you the contract "I won't change this data while you're reading it." A mutable reference gives you "I have exclusive access — no one else is looking."
let mut v = vec![1, 2, 3];
// Create an immutable borrow
let first = &v[0]; // first borrows immutably
// Can't create a mutable borrow while immutable exists
// v.push(4); // ERROR: cannot borrow as mutable
println!("{first}"); // last use of 'first'
// Now immutable borrow is done, so mutable borrow is OK
v.push(4); // OK: no active immutable borrows
// Multiple immutable borrows are fine
let a = &v[0];
let b = &v[1];
println!("{a} {b}"); // OK: multiple readers
Non-Lexical Lifetimes (NLL): Borrows End at Last Use
Before Rust 2018, borrows lasted until the end of the enclosing block (lexical scope). This was overly conservative and caused many "borrow checker rejects valid code" situations.
Non-Lexical Lifetimes (NLL), enabled by default since the 2018 edition, make borrows end at their last point of use, not the end of the block. This makes the borrow checker significantly more permissive and eliminates many false-positive errors. In practice, this means you can use a value again after its last borrow is done, even if that borrow is still in scope.
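A sketch of a pattern NLL accepts (the function is illustrative): the shared borrow from `get` would have lasted to the end of the block under lexical borrows, rejecting the `insert`.

```rust
use std::collections::HashMap;

// The shared borrow created by `get` ends at its last use (NLL),
// so the mutable borrow needed by `insert` is allowed right after,
// inside the same block.
fn bump_hits(mut m: HashMap<String, i32>) -> HashMap<String, i32> {
    let current = m.get("hits").copied().unwrap_or(0); // borrow of `m` ends here
    m.insert("hits".to_string(), current + 1);         // OK: no live borrow
    m
}

fn main() {
    let m = bump_hits(HashMap::new());
    assert_eq!(m.get("hits"), Some(&1));
}
```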
Common Pitfall: Returning References
A classic mistake is trying to return a reference to a local variable:
// ❌ ERROR (missing lifetime): returning a reference to a local would dangle
fn dangling() -> &str {
let s = String::from("hello");
&s // s is dropped here → reference would point to freed memory!
}
// ✅ Return owned value instead
fn not_dangling() -> String {
String::from("hello") // ownership transferred to caller
}
// ✅ Or return a &'static str (lives for entire program)
fn static_str() -> &'static str {
"hello" // string literal is embedded in binary
}
3. Lifetimes (Advanced)
What Are Lifetimes?
A lifetime is a named region of code (e.g., 'a) that the compiler uses to track how long a reference is valid. Lifetimes are how Rust prevents dangling references — the compiler ensures a reference never outlives the data it points to.
Lifetimes are mostly implicit — the compiler infers them using "elision rules." But sometimes you must annotate them explicitly, especially in function signatures with multiple input references. The compiler needs to know: "which input's lifetime should the output inherit?"
Why Lifetimes Matter: A Concrete Example
// The compiler can't infer which input's lifetime to assign to output
// Does the return value borrow from x, y, or neither?
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
if x.len() > y.len() { x } else { y }
}
fn main() {
let s1 = String::from("long string");
let result;
{
let s2 = String::from("xyz");
result = longest(&s1, &s2); // both alive during call
println!("{result}"); // OK
}
// s2 is dropped at the end of the block above
// println!("{result}"); // ERROR: `result` may borrow from s2 and would dangle
}
Lifetime Elision Rules: When You Don't Need Annotations
The compiler applies three rules automatically to infer lifetimes. If these rules apply, you don't need to write explicit lifetime parameters:
| Rule | Description | Example |
|---|---|---|
| Rule 1 | Each input reference gets its own lifetime. | fn foo(x: &str, y: &str) becomes fn foo<'a, 'b>(x: &'a str, y: &'b str) |
| Rule 2 | If there's exactly one input lifetime, it's assigned to all outputs. | fn foo(x: &str) -> &str becomes fn foo<'a>(x: &'a str) -> &'a str |
| Rule 3 | If one input is &self or &mut self, its lifetime is assigned to all outputs (method-specific). | In impl<'a> Foo<'a> { fn bar(&self) -> &str }, output borrows from self. |
If these rules don't apply, you must annotate lifetimes explicitly.
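Two signatures side by side, one where elision applies and one where it doesn't (the functions are illustrative):

```rust
// Rule 2 applies: one input lifetime, so the output elides to it.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// No rule applies: two input lifetimes, and the output could borrow
// from either, so an explicit annotation is required.
fn shorter<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() <= y.len() { x } else { y }
}

fn main() {
    assert_eq!(first_word("hello world"), "hello");
    assert_eq!(shorter("ab", "abc"), "ab");
}
```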
Lifetimes in Structs: When Structs Hold References
If a struct holds a reference, that reference must outlive the struct instance. This requires a lifetime parameter on the struct:
// Struct that holds a reference — MUST have lifetime annotation
struct Excerpt<'a> {
part: &'a str, // reference must outlive the struct
}
impl<'a> Excerpt<'a> {
// Method returning a value (no borrow) — no lifetime needed in return
fn level(&self) -> i32 { 3 }
// Method returning a reference borrows from self (elision rule 3)
// Return type gets self's lifetime 'a
fn announce(&self, announcement: &str) -> &str {
println!("Attention: {announcement}");
self.part // returns reference with lifetime 'a
}
}
The 'static Lifetime
'static is a special lifetime meaning the reference is valid for the entire program duration. String literals are 'static because they're embedded in the binary.
You also see T: 'static as a trait bound, which is different: it means "T either contains no references, or all its references are 'static." It does NOT mean the value lives forever — it means the value CAN live forever if needed.
// 'static reference — valid for entire program
let s: &'static str = "I live forever";
// Trait bound: T contains no non-static references
fn requires_static<T: 'static>(val: T) {
// T can be used safely in callbacks, spawned threads, etc.
}
Key insight: A reference annotated with 'a means "valid for at least as long as 'a." This is a type system guarantee, not a runtime property.
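One place the T: 'static bound appears constantly is std::thread::spawn. A minimal sketch:

```rust
use std::thread;

// thread::spawn requires its closure to be 'static: the spawned thread
// may outlive the caller, so the closure must not capture borrowed data.
fn spawn_sum(data: Vec<i32>) -> i32 {
    // `move` transfers ownership of `data` into the closure,
    // which satisfies the 'static bound (no borrowed captures).
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    handle.join().unwrap()
}

fn main() {
    assert_eq!(spawn_sum(vec![1, 2, 3]), 6);
}
```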
4. Type System & Traits (Core)
Static vs Dynamic Dispatch: impl Trait vs dyn Trait
When working with polymorphism, Rust offers two strategies with different tradeoffs:
impl Trait uses static dispatch through monomorphization. The compiler generates a specialized copy of the function for each concrete type that's used. This has zero runtime cost — no vtable lookups — but increases binary size. The compiler must know all types at compile time.
dyn Trait uses dynamic dispatch via a virtual method table (vtable). At runtime, the vtable pointer determines which function to call. This has a small overhead per call but produces smaller binaries (one version of the code). Enables runtime polymorphism — you can store different types in the same collection.
// Static dispatch — monomorphized at compile time
// Compiler generates a specialized version for each concrete type
fn print_area(shape: impl Shape) {
println!("Area: {}", shape.area());
}
// Calling with Circle and Rectangle → TWO versions generated
// Dynamic dispatch — vtable lookup at runtime
// One version of code, works with any Shape implementor
fn print_area_dyn(shape: &dyn Shape) {
println!("Area: {}", shape.area());
}
// Heterogeneous collection (only possible with dyn)
// Mix different types in the same Vec
let shapes: Vec<Box<dyn Shape>> = vec![
Box::new(Circle { radius: 5.0 }),
Box::new(Rectangle { w: 3.0, h: 4.0 }),
];
Essential Traits Every Rustacean Must Know
These traits appear everywhere in Rust code. Understanding them is critical:
| Trait | Purpose | Key Method(s) | Auto-implemented? |
|---|---|---|---|
| Clone | Explicit deep copy | .clone() | No — must implement or derive |
| Copy | Implicit bitwise copy (stack-only types) | Marker trait, no methods | No — derive, allowed only for stack-only types |
| Drop | Custom cleanup on drop (destructor) | fn drop(&mut self) | Has default (no-op) |
| Display | User-friendly formatting ({}) | fn fmt(&self, ...) | No |
| Debug | Debug formatting ({:?}) | fn fmt(&self, ...) | Yes with #[derive(Debug)] |
| From / Into | Infallible type conversion | fn from(T) -> Self | Into auto-implemented if From exists |
| Deref / DerefMut | Smart pointer dereferencing | fn deref(&self) -> &Target | No |
| Iterator | Lazy sequential access | fn next(&mut self) -> Option<Item> | No |
| Send | Safe to move between threads | Marker trait | Yes (auto trait) |
| Sync | Safe to share &T between threads | Marker trait | Yes (auto trait) |
| Sized | Known size at compile time | Marker trait | Yes (default bound) |
| Fn / FnMut / FnOnce | Callable types (closures, functions) | fn call() -> Output | Auto for closures |
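Implementing a couple of these by hand makes the table concrete. A small sketch using a made-up Celsius type:

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Celsius(f64);

// Display: user-facing formatting, used by {}
impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}°C", self.0)
    }
}

// From: infallible conversion; the matching Into comes for free
impl From<f64> for Celsius {
    fn from(deg: f64) -> Self {
        Celsius(deg)
    }
}

fn main() {
    let c: Celsius = 21.5.into();                  // Into via the From impl
    assert_eq!(c.to_string(), "21.5°C");           // Display ({})
    assert_eq!(format!("{c:?}"), "Celsius(21.5)"); // Debug ({:?}), derived
}
```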
Object Safety: Making Traits into Trait Objects
Not all traits can be used as dyn Trait (trait objects). A trait is object-safe only if:
- All methods have a receiver: &self, &mut self, or self. Methods without receivers (associated functions) don't work with dyn.
- No method returns Self: the compiler can't know Self's size at runtime, so returning it is impossible.
- No method has generic type parameters: fn foo<T>(...) can't be monomorphized through a vtable.
- The trait doesn't require Self: Sized: this bound prevents object safety.
// ✅ Object-safe trait
trait Draw {
fn draw(&self); // has receiver
}
// ❌ NOT object-safe (returns Self)
trait Clonable {
fn clone_self(&self) -> Self; // can't know size at runtime
}
// ❌ NOT object-safe (generic method)
trait Serializer {
fn serialize<T: serde::Serialize>(&self, val: &T); // can't monomorphize
}
5. Enums & Pattern Matching (Core)
Rust Enums: Algebraic Data Types (Sum Types)
Unlike C-style enums (which are just tagged integers), Rust enums are algebraic data types (also called sum types or discriminated unions). Each variant can hold different types and amounts of data. This eliminates entire categories of bugs:
- No null pointers: use Option<T> to represent "value or nothing"
- No error codes: use Result<T, E> for fallible operations
- No invalid states: the type system ensures you handle all variants
// Enum with different data in each variant
enum Message {
Quit, // no data
Move { x: i32, y: i32 }, // named fields (struct-like)
Write(String), // single field (tuple-like)
Color(u8, u8, u8), // multiple fields
}
// Match MUST be exhaustive — compiler forces all variants handled
fn handle(msg: Message) {
match msg {
Message::Quit => println!("Quit"),
Message::Move { x, y } => println!("Move to ({x}, {y})"),
Message::Write(text) => println!("Text: {text}"),
Message::Color(r, g, b) => println!("RGB({r}, {g}, {b})"),
} // ← compiler ensures all variants handled
}
Option<T>: Rust's Null Replacement
Option<T> represents an optional value: it's either Some(T) or None. This replaces null pointers — Rust never lets you accidentally dereference null.
// Returning Option to indicate success or failure (no exception, no null)
fn find_user(id: u64) -> Option<User> {
if id == 1 {
Some(User { name: "Alice".into() })
} else {
None
}
}
// Pattern matching to extract the value
match find_user(1) {
Some(user) => println!("Found: {}", user.name),
None => println!("Not found"),
}
// if let — concise matching for single variant
if let Some(user) = find_user(1) {
println!("Found: {}", user.name);
}
// let-else (Rust 1.65+) — bind or diverge
let Some(user) = find_user(id) else {
return Err("User not found".into());
};
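Besides match, if let, and let-else, Option carries combinators that express common "match, transform, default" pipelines without writing the match out by hand. A small illustrative helper:

```rust
// filter keeps Some only if the predicate holds; map transforms the inner value.
fn half_if_even(n: Option<i32>) -> Option<i32> {
    n.filter(|x| x % 2 == 0) // Some only if the value is even
     .map(|x| x / 2)         // transform the inner value
}

fn main() {
    assert_eq!(half_if_even(Some(8)), Some(4));
    assert_eq!(half_if_even(Some(3)), None);      // filtered out
    assert_eq!(half_if_even(None), None);         // nothing to do
    // unwrap_or supplies a default when the Option is None:
    assert_eq!(half_if_even(None).unwrap_or(0), 0);
}
```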
Pattern Matching: Exhaustiveness & Power
Pattern matching is Rust's killer feature. The compiler forces you to handle every possible case, preventing logic errors. Patterns can destructure complex nested data in one expression:
// Nested pattern matching with guards
enum Action { Go(i32), Stop, Wait(u32) }
match action {
Action::Go(dist) if dist > 100 => println!("Far: {dist}"),
Action::Go(dist) => println!("Near: {dist}"),
Action::Stop => println!("Stop"),
Action::Wait(secs) => println!("Wait {secs}s"),
}
6. Error Handling (Core)
Rust's Two-Error Model: Recoverable vs Unrecoverable
Rust distinguishes between two error types:
- Recoverable errors: Expected failures like "file not found," "network timeout," "invalid input." Use Result<T, E> to handle these. The caller can decide how to respond.
- Unrecoverable errors: Bugs and invariant violations like "index out of bounds," "integer overflow in debug mode," "unwrap() on None." Use panic!() to halt execution. These indicate programmer error.
This is different from languages with exception handling (Java, Python, C++). There's no "try-catch" in Rust — errors are explicit and typed.
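The split looks like this in code (both functions are illustrative):

```rust
use std::num::ParseIntError;

// Recoverable: bad input is expected, so return a Result and let
// the caller decide how to respond.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.trim().parse::<u16>()
}

// Unrecoverable: an empty slice here is a caller bug, and indexing
// panics rather than returning an error value.
fn first_byte(bytes: &[u8]) -> u8 {
    bytes[0]
}

fn main() {
    assert_eq!(parse_port(" 8080 ").unwrap(), 8080);
    assert!(parse_port("not-a-port").is_err());
    assert_eq!(first_byte(b"hi"), b'h');
}
```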
Result<T, E>: Representing Failure Explicitly
Result<T, E> is an enum with two variants: Ok(T) for success and Err(E) for failure. It forces the caller to acknowledge the possibility of failure.
The ? operator (question mark) is the ergonomic way to propagate errors up the call stack. If the operation returns Ok(value), ? extracts the value. If it returns Err(e), ? immediately returns Err(e) from the current function.
use std::fs;
use std::io;
// The ? operator: propagate errors concisely
fn read_config() -> Result<String, io::Error> {
// If read_to_string fails, return Err immediately
let content = fs::read_to_string("config.toml")?;
Ok(content)
}
// Chaining operations with ?
fn process_file() -> Result<String, io::Error> {
let content = fs::read_to_string("data.txt")?;
let trimmed = content.trim().to_string();
Ok(trimmed)
}
Custom Error Types: Using thiserror and anyhow
For production code, define custom error types to represent your domain's error conditions. Two popular crates:
thiserror is designed for library code. It provides a derive macro that automatically implements std::error::Error and From impls for conversion. This makes your error type composable with other crates.
anyhow is designed for application code. It provides Result<T> (shorthand for Result<T, anyhow::Error>) with ergonomic context chaining. Less structure, more convenience.
// === thiserror (library code) ===
use thiserror::Error;
#[derive(Error, Debug)]
enum AppError {
#[error("IO error: {0}")]
Io(#[from] io::Error), // auto From impl
#[error("Parse error: {0}")]
Parse(#[from] serde_json::Error),
#[error("Not found: {0}")]
NotFound(String),
}
// === anyhow (application code) ===
use anyhow::{Context, Result};
fn load_config() -> Result<Config> {
let text = fs::read_to_string("config.toml")
.context("Failed to read config file")?; // add context
let config: Config = toml::from_str(&text)
.context("Failed to parse config")?;
Ok(config)
}
When to Use Which Error Crate
| Crate | Use For | Strengths | Downsides |
|---|---|---|---|
| thiserror | Library code | Structured errors, composable, explicit variants | More boilerplate |
| anyhow | Application code | Ergonomic, context chaining, easy debugging | Less type info for callers |
| eyre | Application code | Like anyhow but customizable reporters (e.g., color-eyre) | Requires more setup |
| miette | CLI tools | Beautiful diagnostic reports with source spans and colors | Higher overhead |
Best practice: In libraries, use thiserror for explicit, type-safe error handling. In applications, use anyhow for ergonomics. Never let errors silently disappear with .unwrap() in production code.
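For reference, a std-only, hand-written sketch of roughly what the thiserror derive provides; this is a simplified approximation, not the macro's actual output:

```rust
use std::fmt;
use std::io;

#[derive(Debug)]
enum AppError {
    Io(io::Error),
    NotFound(String),
}

// The #[error("...")] attribute corresponds to a Display impl
impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::Io(e) => write!(f, "IO error: {e}"),
            AppError::NotFound(what) => write!(f, "Not found: {what}"),
        }
    }
}

// derive(Error) also wires up std::error::Error
impl std::error::Error for AppError {}

// #[from] corresponds to a From impl; it is what lets the ? operator
// convert an io::Error into an AppError automatically
impl From<io::Error> for AppError {
    fn from(e: io::Error) -> Self {
        AppError::Io(e)
    }
}

fn main() {
    let e = AppError::NotFound("user 7".to_string());
    assert_eq!(e.to_string(), "Not found: user 7");
    let e: AppError = io::Error::new(io::ErrorKind::Other, "boom").into();
    assert_eq!(e.to_string(), "IO error: boom");
}
```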
7. Smart Pointers (Advanced)
Smart pointers are structs that behave like references but carry additional metadata and capabilities — ownership semantics, reference counting, interior mutability, or thread synchronization. They implement the Deref and Drop traits, which makes them transparent to use as references while managing resources automatically.
A plain reference &T just borrows data. A smart pointer owns the data it points to and controls its lifetime. When the smart pointer goes out of scope, its Drop implementation cleans up the resource (frees heap memory, decrements a reference count, releases a lock, etc.).
7.1 Quick Reference Table
| Type | Ownership | Thread-Safe | Mutability | Use Case |
|---|---|---|---|---|
| Box<T> | Single owner | Send + Sync (if T is) | Inherited | Heap allocation, recursive types, trait objects |
| Rc<T> | Multiple owners | ❌ Single-threaded only | Immutable | Shared ownership in single thread |
| Arc<T> | Multiple owners | ✅ Atomic refcount | Immutable | Shared ownership across threads |
| Cell<T> | Single owner | ❌ | Interior mutability (Copy) | Mutate through shared reference (Copy types) |
| RefCell<T> | Single owner | ❌ | Interior mutability (runtime) | Borrow-check at runtime instead of compile time |
| Mutex<T> | Single owner (wrap in Arc) | ✅ | Interior mutability (locking) | Thread-safe mutable access |
| RwLock<T> | Single owner (wrap in Arc) | ✅ | Multiple readers OR one writer | Read-heavy concurrent access |
| Cow<'a, T> | Clone-on-write | Depends on T | Clones only when mutation needed | Avoid unnecessary cloning |
| Weak<T> | Non-owning | Rc→❌ / Arc→✅ | Immutable | Break reference cycles, caches |
7.2 The Foundation: Deref and Drop
Every smart pointer implements two key traits that make it feel like a native reference:
// Deref — allows *ptr and auto-dereferencing (ptr.method())
pub trait Deref {
type Target: ?Sized;
fn deref(&self) -> &Self::Target;
}
// DerefMut — allows mutable dereference
pub trait DerefMut: Deref {
fn deref_mut(&mut self) -> &mut Self::Target;
}
// Drop — custom cleanup when value goes out of scope
pub trait Drop {
fn drop(&mut self);
}
Deref coercion in practice: you can pass a &Box<String> to a function expecting &str. The compiler automatically chains: &Box<String> → &String → &str. This works transitively through any number of Deref implementations.
fn greet(name: &str) {
println!("Hello, {name}!");
}
let boxed: Box<String> = Box::new(String::from("Rust"));
greet(&boxed); // Deref coercion: &Box<String> → &String → &str
// Drop order: variables drop in REVERSE declaration order
{
let a = Box::new("first"); // dropped second
let b = Box::new("second"); // dropped first
} // b.drop() then a.drop()
// Early drop with std::mem::drop()
let lock = mutex.lock().unwrap();
// ... use lock ...
drop(lock); // release lock early, before scope ends
7.3 Box<T> — Heap Allocation
Box<T> is the simplest smart pointer: it allocates T on the heap and owns it exclusively. When the Box is dropped, the heap memory is freed. Zero runtime overhead beyond the allocation itself — a Box<T> is just a pointer (one usize wide).
When to use Box
// 1. Recursive types — Box breaks infinite size
enum List {
Cons(i32, Box<List>), // without Box: compiler error "infinite size"
Nil,
}
let list = List::Cons(1,
Box::new(List::Cons(2,
Box::new(List::Cons(3,
Box::new(List::Nil))))));
// 2. Trait objects — dynamic dispatch
trait Animal {
fn speak(&self) -> &str;
}
struct Dog;
impl Animal for Dog {
fn speak(&self) -> &str { "Woof!" }
}
let pet: Box<dyn Animal> = Box::new(Dog);
println!("{}", pet.speak()); // dynamic dispatch via vtable
// Box<dyn Trait> is a "fat pointer" — two usizes:
// ptr to data | ptr to vtable
// 3. Large values — avoid stack overflow on huge types
let big = Box::new([0u8; 1_000_000]); // 1MB on heap, not stack
// 4. Transfer ownership without copying
fn process(data: Box<[u8; 1_000_000]>) {
// only the pointer (8 bytes) is moved, not 1MB of data
}
// 5. Box::leak — intentionally leak for 'static lifetime
let s: &'static str = Box::leak(Box::new(String::from("forever")));
// useful for lazy-initialized global strings
7.4 Rc<T> — Reference Counting (Single-Thread)
Rc<T> enables multiple owners of the same heap-allocated value. Each Rc::clone() increments a reference count; when the last Rc is dropped, the value is deallocated. This is not thread-safe — the counters use non-atomic operations.
use std::rc::Rc;
let a = Rc::new(String::from("shared data"));
println!("count after a: {}", Rc::strong_count(&a)); // 1
let b = Rc::clone(&a); // cheap! just increments counter
println!("count after b: {}", Rc::strong_count(&a)); // 2
{
let c = Rc::clone(&a);
println!("count in block: {}", Rc::strong_count(&a)); // 3
} // c dropped → count = 2
println!("count after block: {}", Rc::strong_count(&a)); // 2
drop(b); // count = 1
// when `a` drops → count = 0 → String is deallocated
Rc::clone() vs .clone() — always write Rc::clone(&x) instead of x.clone(). Both do the same thing, but the convention makes it visually clear you're just incrementing a counter (cheap), not deep-cloning the inner value (potentially expensive).
7.5 Weak<T> — Breaking Reference Cycles
Rc can create reference cycles that prevent memory from ever being freed (memory leak). Weak<T> holds a non-owning reference that doesn't increment the strong count. You must upgrade() a Weak to an Option<Rc<T>> before using it — it returns None if the value was already dropped.
use std::rc::{Rc, Weak};
use std::cell::RefCell;
// Tree with parent back-references (classic cycle problem)
struct Node {
value: i32,
parent: RefCell<Weak<Node>>, // Weak! prevents cycle
children: RefCell<Vec<Rc<Node>>>, // Rc — parent owns children
}
let leaf = Rc::new(Node {
value: 3,
parent: RefCell::new(Weak::new()),
children: RefCell::new(vec![]),
});
let branch = Rc::new(Node {
value: 5,
parent: RefCell::new(Weak::new()),
children: RefCell::new(vec![Rc::clone(&leaf)]),
});
// Set parent — using Weak to avoid cycle
*leaf.parent.borrow_mut() = Rc::downgrade(&branch);
// Access parent via upgrade()
match leaf.parent.borrow().upgrade() {
Some(parent) => println!("leaf's parent = {}", parent.value),
None => println!("parent was dropped"),
}
println!("branch strong={}, weak={}",
Rc::strong_count(&branch), // 1
Rc::weak_count(&branch), // 1 (from leaf's parent)
);
7.6 Cell<T> and RefCell<T> — Interior Mutability
Rust's ownership rules normally require &mut for mutation. Interior mutability lets you mutate data through a shared & reference, moving the borrow check from compile time to runtime (or avoiding it for Copy types).
Cell<T> — For Copy types (zero overhead)
use std::cell::Cell;
let x = Cell::new(42); // no mut needed!
x.set(99); // mutate through shared reference
let val = x.get(); // returns a COPY of the value (T: Copy required)
println!("{val}"); // 99
// Common pattern: counter inside shared struct
struct Stats {
calls: Cell<u64>, // can increment without &mut Stats
}
impl Stats {
fn record_call(&self) { // note: &self, not &mut self
self.calls.set(self.calls.get() + 1);
}
}
// Cell has zero runtime cost — no flags, no borrowing state
// But: only works with Copy types, no references to interior
RefCell<T> — Runtime borrow checking for any T
use std::cell::RefCell;
let data = RefCell::new(vec![1, 2, 3]);
// borrow() → Ref<T> (shared, like &T)
{
let r1 = data.borrow();
let r2 = data.borrow(); // multiple shared borrows OK
println!("{:?} {:?}", r1, r2);
} // r1 and r2 dropped — borrows released
// borrow_mut() → RefMut<T> (exclusive, like &mut T)
{
let mut w = data.borrow_mut();
w.push(4); // can mutate through shared &RefCell
}
// PANICS at runtime if rules violated!
let r = data.borrow();
// let w = data.borrow_mut(); // PANIC! already borrowed as shared
drop(r); // release the shared borrow so the mutable borrow below can succeed
// try_borrow() / try_borrow_mut() — return Result instead of panicking
match data.try_borrow_mut() {
Ok(mut_ref) => mut_ref.push(5),
Err(_) => eprintln!("already borrowed!"),
}
Best practice: Use Cell<T> when T: Copy (integers, bools, small structs) — it's zero-cost. Use RefCell<T> when you need references to the inner value or T isn't Copy. RefCell has a small runtime cost (tracks borrow state with a counter) and panics on violation.
7.7 Rc<RefCell<T>> — Shared Mutable State (Single-Thread)
The canonical single-threaded pattern for multiple owners who all need to mutate the same data. Rc provides shared ownership, RefCell provides interior mutability.
use std::rc::Rc;
use std::cell::RefCell;
// Shared mutable list — multiple owners can push/pop
let shared = Rc::new(RefCell::new(vec![1, 2, 3]));
let clone1 = Rc::clone(&shared);
let clone2 = Rc::clone(&shared);
clone1.borrow_mut().push(4);
clone2.borrow_mut().push(5);
println!("{:?}", shared.borrow()); // [1, 2, 3, 4, 5]
// Observer pattern with Rc<RefCell<T>>
type Callback = Box<dyn Fn(i32)>;
struct EventEmitter {
listeners: Vec<Rc<RefCell<Callback>>>,
}
impl EventEmitter {
fn emit(&self, value: i32) {
for listener in &self.listeners {
(*listener.borrow())(value); // deref the Ref, then call the boxed closure
}
}
}
7.8 Arc<T> — Atomic Reference Counting (Multi-Thread)
Arc<T> is the thread-safe version of Rc<T>. It uses atomic operations for the reference count, making it safe to share across threads. The cost: atomic operations are slightly more expensive than non-atomic ones.
use std::sync::Arc;
use std::thread;
let data = Arc::new(vec![1, 2, 3, 4, 5]);
let handles: Vec<_> = (0..3).map(|i| {
let data = Arc::clone(&data); // increment atomic counter
thread::spawn(move || {
let sum: i32 = data.iter().sum();
println!("Thread {i}: sum = {sum}");
})
}).collect();
for h in handles { h.join().unwrap(); }
// Arc::try_unwrap — recover owned T if you're the last reference
let sole_owner = Arc::new(String::from("mine"));
match Arc::try_unwrap(sole_owner) {
Ok(s) => println!("Got it: {s}"), // only if strong_count == 1
Err(arc) => println!("Still shared"),
}
Arc<T> only provides shared ownership — not mutable access. To mutate, combine with a synchronization primitive: Arc<Mutex<T>> or Arc<RwLock<T>>.
7.9 Arc<Mutex<T>> and Arc<RwLock<T>> — Thread-Safe Mutable State
// Arc + Mutex = thread-safe shared mutable state
use std::sync::{Arc, Mutex};
let counter = Arc::new(Mutex::new(0));
let handles: Vec<_> = (0..10).map(|_| {
let counter = Arc::clone(&counter);
std::thread::spawn(move || {
let mut num = counter.lock().unwrap();
*num += 1;
// MutexGuard dropped here → lock released
})
}).collect();
for h in handles { h.join().unwrap(); }
println!("Result: {}", *counter.lock().unwrap()); // 10
// Mutex poisoning — if a thread panics while holding the lock,
// the Mutex becomes "poisoned" and lock() returns Err
let data = Arc::new(Mutex::new(vec![1]));
let d = Arc::clone(&data);
let _ = std::thread::spawn(move || {
let mut v = d.lock().unwrap();
v.push(2);
panic!("oops"); // lock is poisoned!
}).join();
match data.lock() {
Ok(guard) => println!("data: {:?}", *guard),
Err(poisoned) => {
// recover the data anyway:
let guard = poisoned.into_inner();
println!("recovered: {:?}", *guard); // [1, 2]
}
}
// Arc + RwLock = multiple readers OR one writer
use std::sync::{Arc, RwLock};
let config = Arc::new(RwLock::new(String::from("v1")));
// Multiple readers simultaneously
{
let r1 = config.read().unwrap(); // shared read lock
let r2 = config.read().unwrap(); // another shared read — OK!
println!("r1={r1}, r2={r2}");
} // read locks dropped
// One writer (blocks until all readers release)
{
let mut w = config.write().unwrap(); // exclusive write lock
*w = String::from("v2");
}
// When to use RwLock vs Mutex:
// RwLock → read-heavy workloads (many readers, rare writes)
// Mutex → write-heavy workloads or when critical sections are short
// RwLock has higher overhead per lock/unlock due to reader tracking
7.10 Cow<'a, T> — Clone on Write
Cow (Clone on Write) is an enum that can hold either a borrowed reference or an owned value. It only clones when you actually need to mutate. This avoids unnecessary allocations in the common "read-only" path.
use std::borrow::Cow;
// Definition:
// enum Cow<'a, B: ToOwned + ?Sized> {
// Borrowed(&'a B),
// Owned(<B as ToOwned>::Owned),
// }
// Function that may or may not need to modify a string
fn maybe_uppercase(s: &str, shout: bool) -> Cow<str> {
if shout {
Cow::Owned(s.to_uppercase()) // allocates only when needed
} else {
Cow::Borrowed(s) // zero cost — just a reference
}
}
let quiet = maybe_uppercase("hello", false); // no allocation
let loud = maybe_uppercase("hello", true); // allocates "HELLO"
// to_mut() — clones only if currently Borrowed
let mut s: Cow<str> = Cow::Borrowed("hello");
s.to_mut().push_str(" world"); // now clones "hello" into owned String
println!("{s}"); // "hello world"
// Common use case: processing strings that usually don't need changes
fn normalize_path(path: &str) -> Cow<str> {
if path.contains('\\') {
Cow::Owned(path.replace('\\', "/"))
} else {
Cow::Borrowed(path) // most paths already use /, no allocation
}
}
// Cow with Vec
fn ensure_sorted(data: &[i32]) -> Cow<[i32]> {
if data.windows(2).all(|w| w[0] <= w[1]) {
Cow::Borrowed(data) // already sorted
} else {
let mut v = data.to_vec();
v.sort();
Cow::Owned(v)
}
}
7.11 Building a Custom Smart Pointer
Understanding how to implement Deref and Drop demystifies what smart pointers actually do. Here's a minimal wrapper that tracks how many times its value is accessed:
use std::ops::Deref;
use std::cell::Cell;
struct Tracked<T> {
value: T,
access_count: Cell<u32>,
}
impl<T> Tracked<T> {
fn new(value: T) -> Self {
Tracked { value, access_count: Cell::new(0) }
}
fn access_count(&self) -> u32 {
self.access_count.get()
}
}
impl<T> Deref for Tracked<T> {
type Target = T;
fn deref(&self) -> &T {
self.access_count.set(self.access_count.get() + 1);
&self.value
}
}
impl<T: std::fmt::Debug> Drop for Tracked<T> {
fn drop(&mut self) {
println!(
"Dropping Tracked({:?}), accessed {} times",
self.value, self.access_count.get()
);
}
}
let t = Tracked::new(String::from("hello"));
println!("len = {}", t.len()); // Deref → &String → .len()
println!("upper = {}", t.to_uppercase());
println!("accessed {} times", t.access_count()); // 2
// drop: "Dropping Tracked("hello"), accessed 2 times"
7.12 Which Smart Pointer Do I Need?
- Box<T> — one allocation, zero runtime overhead after.
- Rc<T> — cheap clone (non-atomic counter increment).
- Arc<T> — slightly more expensive clone (atomic increment).
- Cell<T> — zero overhead.
- RefCell<T> — small overhead (runtime borrow tracking).
- Mutex<T> — OS-level lock (heavier).
- RwLock<T> — heavier than Mutex per operation, but allows concurrent reads.
- Cow<'a, T> — zero cost on the borrowed path.
7.13 Real-World Use Cases & Production Patterns
Box<T> — When and Why
// ── Use Case 1: AST / Compiler — recursive tree structures ──
// Without Box, the compiler can't compute the size of Expr
enum Expr {
Literal(f64),
Add(Box<Expr>, Box<Expr>), // recursive!
Mul(Box<Expr>, Box<Expr>),
Negate(Box<Expr>),
FnCall { name: String, args: Vec<Expr> },
}
// Real-world: rustc, syn crate, any parser, SQL query planners
// ── Use Case 2: Plugin / Strategy pattern — dyn Trait dispatch ──
trait Compressor: Send + Sync {
fn compress(&self, data: &[u8]) -> Vec<u8>;
fn name(&self) -> &str;
}
struct Pipeline {
compressor: Box<dyn Compressor>, // chosen at runtime
// could be GzipCompressor, ZstdCompressor, LZ4Compressor...
}
// Real-world: web servers (middleware), serializers, storage engines
// ── Use Case 3: Avoid large stack moves ──
struct ImageBuffer {
pixels: [u8; 4096 * 4096 * 4], // 64MB! stack overflow risk
}
let img = Box::new(ImageBuffer { pixels: [0; 4096 * 4096 * 4] });
// Moving the Box moves only 8 bytes (the pointer), not 64MB.
// Caveat: Box::new evaluates its argument first, so in debug builds
// the array may still be built on the stack before being copied to the heap.
// ── Use Case 4: Returning closures from functions ──
fn make_greeter(greeting: String) -> Box<dyn Fn(&str) -> String> {
Box::new(move |name| format!("{greeting}, {name}!"))
}
// Each closure has a unique anonymous type → Box erases it
// ── Use Case 5: Box::leak for global/static data ──
fn init_config() -> &'static Config {
let config = load_config_from_file();
Box::leak(Box::new(config)) // lives forever, never freed
}
// Real-world: lazy_static alternative, global singletons, string interning
Rc<T> — When and Why
// ── Use Case 1: Graph / DAG — nodes with multiple parents ──
use std::rc::Rc;
struct DagNode {
value: String,
children: Vec<Rc<DagNode>>, // child can have multiple parents
}
// Without Rc you'd need to pick ONE owner → can't represent DAGs
// Real-world: dependency graphs, build systems, type inference
// ── Use Case 2: Undo/Redo — persistent data structures ──
// Each undo state shares most data with the previous state
struct EditorState {
text: Rc<String>, // shared if unchanged between versions
cursor_pos: usize,
selection: Option<(usize, usize)>,
}
struct UndoStack {
history: Vec<EditorState>, // states share Rc'd data → cheap snapshots
}
// Real-world: text editors, database MVCC, immutable data structures (im crate)
// ── Use Case 3: GUI widget trees — parent shares child refs ──
struct Widget {
label: String,
children: Vec<Rc<Widget>>,
}
// A layout system and a renderer both hold Rc to the same widget tree
// Real-world: druid, iced (GUI frameworks)
Arc<T> / Arc<Mutex<T>> / Arc<RwLock<T>> — When and Why
// ── Use Case 1: Shared app state in a web server ──
use std::sync::{Arc, RwLock};
struct AppState {
db_pool: PgPool, // immutable after init
rate_limiter: Arc<RwLock<RateLimiter>>, // mutable, read-heavy
cache: Arc<RwLock<HashMap<String, CachedResponse>>>,
}
// Axum shares state across handler threads via Arc
// Real-world: every Axum/Actix/Warp web server
// ── Use Case 2: Background task + main thread sharing data ──
use std::sync::{Arc, Mutex};
struct ProgressTracker {
completed: usize,
total: usize,
errors: Vec<String>,
}
let progress = Arc::new(Mutex::new(ProgressTracker {
completed: 0, total: 1000, errors: vec![]
}));
// Worker thread updates progress
let p = Arc::clone(&progress);
std::thread::spawn(move || {
for _ in 0..1000 {
// ... do work ...
p.lock().unwrap().completed += 1;
}
});
// Main thread reads progress for UI updates
let snap = progress.lock().unwrap();
println!("{}% done", snap.completed * 100 / snap.total);
// ── Use Case 3: Shared immutable config (no lock needed!) ──
let config = Arc::new(Config::load()); // immutable → no Mutex needed
// Arc<T> implements Deref, so all threads can read config.max_retries
// directly. Arc only needs Mutex if you need to MUTATE.
// Real-world: feature flags, DB connection strings, TLS certs
// ── Use Case 4: Connection pool — Arc without Mutex ──
// Database pools (sqlx, deadpool) internally handle synchronisation
// You just wrap the pool in Arc for sharing across tasks
// let pool = Arc::new(PgPool::connect("postgres://...").await?);
// Each Tokio task gets Arc::clone(&pool) — pool manages its own locks
Cow<'a, T> — Real Production Patterns
// ── Use Case 1: URL encoding — most strings pass through unchanged ──
use std::borrow::Cow;
fn url_encode(input: &str) -> Cow<str> {
if input.chars().all(|c| c.is_alphanumeric() || c == '-' || c == '_') {
Cow::Borrowed(input) // 90% of URLs — zero allocation
} else {
Cow::Owned(do_percent_encode(input)) // only when needed
}
}
// Real-world: percent-encoding crate, serde deserialization, HTML escaping
// ── Use Case 2: Config defaults with optional overrides ──
fn get_db_url(env_override: Option<&str>) -> Cow<str> {
match env_override {
Some(url) => Cow::Borrowed(url),
None => Cow::Owned("postgres://localhost:5432/mydb".to_string()),
}
}
// ── Use Case 3: Text processing pipeline ──
fn normalize_whitespace(s: &str) -> Cow<str> {
if s.contains("  ") || s.contains('\t') { // double spaces or tabs
Cow::Owned(s.split_whitespace().collect::<Vec<_>>().join(" "))
} else {
Cow::Borrowed(s)
}
}
// Chain: input → normalize_whitespace → url_encode → output
// If input is clean, ZERO allocations through the entire pipeline!
// ── Use Case 4: API response — avoid cloning when not needed ──
struct ApiResponse<'a> {
status: u16,
body: Cow<'a, str>, // borrowed from cache OR owned from computation
}
// Cache hit → Cow::Borrowed (zero-copy from cache)
// Cache miss → Cow::Owned (freshly computed)
Weak<T> — Beyond Cycles
// ── Use Case 1: LRU Cache — weak references for eviction ──
use std::sync::{Arc, Weak};
use std::collections::HashMap;
use std::hash::Hash; // for the K: Eq + Hash bound below
struct WeakCache<K, V> {
entries: HashMap<K, Weak<V>>, // doesn't keep values alive
}
impl<K: Eq + Hash, V> WeakCache<K, V> {
fn get(&self, key: &K) -> Option<Arc<V>> {
self.entries.get(key)?.upgrade() // None if evicted
}
}
// Real-world: asset caches (textures, fonts), connection pools, observers
// ── Use Case 2: Observer pattern — observers don't keep subject alive ──
struct Subject {
observers: Vec<Weak<dyn Observer>>, // won't prevent cleanup
}
impl Subject {
fn notify(&self) {
for weak_obs in &self.observers {
if let Some(obs) = weak_obs.upgrade() {
obs.on_update();
}
// Dead observers silently skipped — no dangling pointers!
}
}
fn cleanup_dead(&mut self) {
self.observers.retain(|w| w.strong_count() > 0);
}
}
// Real-world: pub/sub systems, UI bindings, game entity systems
Cell<T> / RefCell<T> — Beyond the Basics
// ── Use Case 1: Memoization / lazy field in an immutable struct ──
use std::cell::OnceCell;
struct CompiledRegex; // placeholder
struct MyRegex {
pattern: String,
compiled: OnceCell<CompiledRegex>, // lazily initialized
}
impl MyRegex {
fn compiled(&self) -> &CompiledRegex {
self.compiled.get_or_init(|| compile(&self.pattern))
}
}
// &self method, yet initializes an internal field on first call
// Real-world: regex crate, cached computations, lazy formatting
// ── Use Case 2: Visited set inside a recursive traversal ──
use std::cell::RefCell;
use std::collections::HashSet;
struct Graph {
nodes: Vec<Vec<usize>>,
}
impl Graph {
fn dfs(&self, start: usize) -> Vec<usize> {
let visited = RefCell::new(HashSet::new());
let result = RefCell::new(Vec::new());
self.dfs_inner(start, &visited, &result);
result.into_inner()
}
fn dfs_inner(&self, node: usize,
visited: &RefCell<HashSet<usize>>,
result: &RefCell<Vec<usize>>) {
if !visited.borrow_mut().insert(node) { return; }
result.borrow_mut().push(node);
for &neighbor in &self.nodes[node] {
self.dfs_inner(neighbor, visited, result);
}
}
}
// RefCell lets &self methods mutate internal tracking state
// Real-world: graph algorithms, tree walkers, interpreters
// ── Use Case 3: Cell for counters / flags without &mut ──
use std::cell::Cell;
struct ConnectionPool {
max_size: usize,
active: Cell<usize>, // track without &mut self
total_requests: Cell<u64>, // statistics counter
}
impl ConnectionPool {
fn acquire(&self) -> bool {
self.total_requests.set(self.total_requests.get() + 1);
if self.active.get() < self.max_size {
self.active.set(self.active.get() + 1);
true
} else {
false
}
}
}
// Real-world: metrics, debug counters, feature flag tracking
Smart Pointer Anti-Patterns
// ❌ Anti-pattern 1: Using Rc when ownership is clear
// BAD: fn process(data: Rc<Vec<u8>>) { ... }
// GOOD: fn process(data: &[u8]) { ... } // just borrow!
// Only use Rc when you genuinely need SHARED OWNERSHIP
// ❌ Anti-pattern 2: Arc<Mutex<T>> for read-only data
// BAD: Arc<Mutex<Config>> when config never changes after init
// GOOD: Arc<Config> — no lock needed for immutable data!
// ❌ Anti-pattern 3: RefCell everywhere to "shut up the borrow checker"
// BAD: wrapping everything in RefCell to avoid thinking about ownership
// GOOD: redesign your data flow first. RefCell is a targeted escape hatch,
// not a default choice.
// ❌ Anti-pattern 4: Box for small Copy types
// BAD: Box<i32> — heap allocation for 4 bytes is wasteful
// GOOD: just use i32 on the stack
// ❌ Anti-pattern 5: Rc<RefCell<T>> in multi-threaded code
// This won't even compile — Rc is !Send
// GOOD: Arc<Mutex<T>> or Arc<RwLock<T>>
// ✅ The golden rule: use the SIMPLEST pointer that satisfies your needs
// Stack value > &T > Box<T> > Rc<T> > Arc<T> > Arc<Mutex<T>>
// Left = faster, simpler. Right = more capability, more overhead.
8. Concurrency Advanced
Send and Sync Traits
Send and Sync are marker traits that describe thread-safety guarantees. They are automatically implemented (or not) by the compiler based on your type's fields.
- Send: A type is Send if it can be safely moved between threads. When you transfer ownership to another thread, the original thread loses the value. Examples: Box<T>, Vec<T>, primitives. Counter-examples: Rc<T> (reference count is not atomic), raw pointers.
- Sync: A type is Sync if it can be safely shared between threads via immutable references. If T is Sync, then &T can be sent to another thread. Rule of thumb: T is Sync if and only if &T is Send. Examples: most primitives, Arc<T>. Counter-examples: Cell<T>, RefCell<T> (interior mutability without locking).
| Type | Send? | Sync? | Notes |
|---|---|---|---|
| Primitives (i32, bool, etc.) | ✅ | ✅ | Copy types, no pointers |
| Box<T> | If T is Send | If T is Sync | Unique ownership |
| Vec<T> | If T is Send | If T is Sync | Heap-allocated buffer |
| Rc<T> | ❌ | ❌ | Non-atomic refcount |
| Arc<T> | If T is Send | If T is Sync | Atomic refcount |
| Cell<T> | If T is Send | ❌ | Interior mutability without sync |
| RefCell<T> | If T is Send | ❌ | Runtime borrow checks |
| Mutex<T> | If T is Send | If T is Send | Thread-safe locking |
| RwLock<T> | If T is Send | If T is Send + Sync | Multiple readers or one writer |
| *const T, *mut T | ❌ | ❌ | Raw pointers (unsafe, compiler can't verify) |
Manual implementations: You can manually implement Send or Sync for your types using unsafe impl, but you take responsibility for the safety invariants.
Threading Primitives
Spawning threads: Use std::thread::spawn() to create a new OS thread. The closure must be FnOnce + Send + 'static — in practice you write move closures so captured values are transferred into the thread. Join the returned handle to wait for completion.
Message passing with channels (MPSC): The std::sync::mpsc module provides multi-producer, single-consumer channels. The sender (Sender<T>) can be cloned and moved to other threads; there is exactly one receiver (Receiver<T>), which is not Clone. Sending moves the message, so no shared state is needed.
Scoped threads (Rust 1.63+): std::thread::scope() allows child threads to borrow from the parent's stack. The scope guarantees all threads finish before returning, eliminating the need for Arc in many cases.
use std::sync::mpsc;
use std::thread;
// Message passing with channels
let (tx, rx) = mpsc::channel();
let producer = thread::spawn(move || {
for i in 0..5 {
tx.send(format!("msg {i}")).unwrap();
}
});
// Receive all messages (channel closed when tx is dropped)
for msg in rx {
println!("Received: {msg}");
}
producer.join().unwrap(); // wait for sender thread
// Scoped threads (Rust 1.63+) — borrow from parent stack
let mut data = vec![1, 2, 3];
std::thread::scope(|s| {
    s.spawn(|| {
        println!("Read: {:?}", &data); // borrow data — no Arc needed!
    });
    s.spawn(|| {
        println!("Also read: {}", data.len()); // multiple shared borrows OK
    });
}); // all threads guaranteed to finish before this line
data.push(4); // exclusive access again once the scope ends
Data Parallelism with Rayon
Rayon provides a high-level API for data parallelism. Instead of manually spawning threads, you convert iterators to parallel iterators with .into_par_iter() or .par_iter(), and Rayon distributes the work across a work-stealing thread pool. The closures you pass must be Send + Sync; with .par_iter() each item is accessed by shared reference, so no data is copied.
use rayon::prelude::*;
// Parallel iteration — Rayon handles thread pool
let sum: i64 = (0..1_000_000)
.into_par_iter() // convert to parallel iterator
.map(|x| x * x) // each thread maps its chunk
.sum(); // Rayon combines results
// Parallel sorting (built-in for Rayon)
let mut vec = vec![5, 2, 8, 1];
vec.par_sort_unstable(); // O(n log n) in parallel
// Filter + map in parallel
let result: Vec<_> = (0..1000)
.into_par_iter()
.filter(|x| x % 2 == 0)
.map(|x| x * x)
.collect();
9. Async / Await Advanced
Futures and the Executor Model
An async fn returns a Future — an object that implements the Future trait and represents a value that will be available later (possibly never, or only after an error). Futures are lazy: calling an async fn does not execute its body. Nothing happens until you .await the future or hand it to an executor.
When you .await a future, you suspend the current task. The executor then polls other futures while waiting for I/O or timers. This is a zero-cost abstraction — no threads are spawned; the compiler generates a state machine that transitions between states as operations complete.
Key difference from JavaScript: Rust is not callback-based. Rust futures are compiled to enums with states, and .await yields control back to the executor's event loop. The runtime polls your future repeatedly (via the poll method) until it returns Poll::Ready(value).
Async State Machine Compilation — In Depth
This is the core insight of Rust's async model. When you write an async fn, the compiler transforms it into a state machine struct that implements the Future trait. Each .await point becomes a state transition. No threads are created — the entire execution is driven by a polling loop.
Step 1: Your async code
async fn fetch_data(client: Client, url: String) -> String {
let resp = client.get(&url).await; // ← .await point #1
let body = resp.text().await; // ← .await point #2
body
}
Step 2: What the compiler generates (conceptual)
The compiler creates an enum with one variant per state. Each variant stores exactly the local variables that are alive at that suspension point — nothing more. This is why async Rust is zero-cost: the state machine is sized at compile time with no heap allocation for the future itself.
// The compiler turns your async fn into something like this:
enum FetchDataFuture {
// Before first .await — holds the function arguments
Start { client: Client, url: String },
// Waiting on client.get() — holds the sub-future + data needed later
WaitingForResponse {
get_future: ClientGetFuture, // the inner future being polled
},
// Waiting on resp.text() — holds the sub-future
WaitingForBody {
text_future: ResponseTextFuture,
},
// Terminal state — already returned a value
Done,
}
Step 3: The generated Future impl
The poll() method is a state machine dispatcher. Each call to poll() resumes where the last one left off, like a manually written coroutine. (The code below is simplified pseudocode: real generated code also handles pinning and uses fully qualified variant names.)
impl Future for FetchDataFuture {
type Output = String;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<String> {
loop {
match self {
Start { client, url } => {
// Create the inner future for client.get()
let get_future = client.get(url);
// Transition to next state
*self = WaitingForResponse { get_future };
// Fall through to poll the inner future immediately
}
WaitingForResponse { get_future } => {
match get_future.poll(cx) {
Poll::Ready(resp) => {
// Inner future resolved! Create the next one.
let text_future = resp.text();
*self = WaitingForBody { text_future };
// Fall through to poll text_future immediately
}
Poll::Pending => {
// I/O not ready yet — suspend.
// The Waker in cx will notify the executor
// when data arrives, triggering another poll().
return Poll::Pending;
}
}
}
WaitingForBody { text_future } => {
match text_future.poll(cx) {
Poll::Ready(body) => {
*self = Done;
return Poll::Ready(body); // ✅ final value!
}
Poll::Pending => return Poll::Pending,
}
}
Done => panic!("polled after completion"),
}
}
}
}
The Waker and Context
The Context passed to poll() contains a Waker — a handle that the inner future uses to tell the executor "wake me up when I'm ready to make progress." Here's how it works end-to-end:
// The Waker flow:
// 1. Executor calls poll(cx) on your future
// 2. Your future calls inner_future.poll(cx), passing the Waker along
// 3. The leaf future (e.g. TCP socket read) registers the Waker
// with the OS event system (epoll/kqueue/IOCP)
// 4. Returns Poll::Pending — executor parks this task
// 5. When the OS says "data is ready", the reactor calls waker.wake()
// 6. Executor re-queues the task and calls poll(cx) again
// 7. This time inner_future.poll(cx) returns Poll::Ready(data)
// Manual Future impl showing Waker usage (illustrative — a real timer
// would not spawn a thread per poll):
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use std::time::Instant;
struct Delay {
    when: Instant,
}
impl Future for Delay {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if Instant::now() >= self.when {
            Poll::Ready(()) // timer expired
        } else {
            // Clone the waker and hand it to a timer thread
            let waker = cx.waker().clone();
            let when = self.when;
            std::thread::spawn(move || {
                std::thread::sleep(when.saturating_duration_since(Instant::now()));
                waker.wake(); // ← tells the executor to poll us again
            });
            Poll::Pending
        }
    }
}
Why This Is Zero-Cost
- Futures are lazy — the state machine sits in the Start state until first polled. Nothing happens until the executor drives it.
- Futures compose by nesting — when you .await inside an async fn, the inner future is stored as a field in the outer enum. The whole tree of futures is one big nested struct.
- Cancellation = drop — dropping the state machine enum drops all its fields, cleaning up all inner futures. No special cancellation protocol needed.
- Size = max of all states — the enum is as large as its biggest variant (plus discriminant). You can check with std::mem::size_of_val(&my_future).
- Self-referential problem — local variables held across an .await may reference each other. That's why Pin is required: the state machine can't be moved in memory after it has been polled.
Visualizing the Size
// Each .await adds a state variant. The future's size is the max variant.
async fn small() {
let x: u8 = 1;
tokio::time::sleep(Duration::from_secs(1)).await;
println!("{x}");
}
// Size ≈ max(size_of Sleep future + u8, ...)
async fn big() {
let buf = [0u8; 4096]; // 4KB on the state machine!
tokio::time::sleep(Duration::from_secs(1)).await;
println!("{}", buf.len());
}
// Size ≈ 4KB+ because buf lives across the .await
// Fix: Box the buffer, or don't hold it across .await
// Check sizes at runtime:
println!("small: {} bytes", std::mem::size_of_val(&small()));
println!("big: {} bytes", std::mem::size_of_val(&big()));
Pin and Self-Reference
Pin<P> is a pinning wrapper that prevents a value from being moved in memory after creation. This is critical for async because generated futures are often self-referential: they may contain internal pointers (e.g., to local variables in the state machine). If the future is moved in memory, those pointers become dangling.
By requiring Pin<&mut Self> to call poll(), the runtime guarantees the future's address doesn't change between polls. You rarely construct Pin manually; the executor does it for you.
Practical rule: If you're using async/await, you don't need to think about Pin. If you're implementing custom futures, you must understand it.
Writing and Composing Async Functions
async blocks and functions: An async block returns a future. You can .await it later. An async fn is syntactic sugar for a function that returns an impl Future.
Concurrent composition: Use tokio::join! to wait for multiple futures concurrently (not sequentially). Use tokio::select! to race futures and handle the first one that completes.
use tokio;
// Async function — returns a Future
async fn fetch_url(url: &str) -> Result<String, reqwest::Error> {
let resp = reqwest::get(url).await?;
let body = resp.text().await?;
Ok(body)
}
// Concurrent execution with join! — both run concurrently
async fn fetch_both() -> (String, String) {
let (a, b) = tokio::join!(
fetch_url("https://api.example.com/a"),
fetch_url("https://api.example.com/b"),
);
(a.unwrap(), b.unwrap())
}
// select! — race multiple futures, use first one to complete
async fn race_example() {
tokio::select! {
val = fetch_url("https://api.example.com/a") => {
println!("A finished first: {val:?}");
},
val = fetch_url("https://api.example.com/b") => {
println!("B finished first: {val:?}");
},
_ = tokio::time::sleep(tokio::time::Duration::from_secs(5)) => {
println!("Timeout!");
}
}
}
// async block (doesn't execute until awaited)
let future = async {
println!("I run when the future is awaited");
42
};
// nothing prints until we await (this code itself must be inside an async fn)
let value = future.await; // now it runs
Async Runtimes
An async runtime (executor) is necessary to run async code. It manages a thread pool, an event loop, and wakes up tasks when I/O is ready. The most popular runtime is Tokio, but others exist:
| Runtime | Best For | Threads | Features |
|---|---|---|---|
| tokio | Network services, general async Rust | Multi-threaded (configurable) | Mature, large ecosystem, timers, channels, file I/O, spawn_blocking for CPU tasks |
| async-std | std-like async API | Multi-threaded | Mirrors std library closely, simpler learning curve |
| smol | Minimal, embedded, or custom runtimes | Configurable | Tiny runtime, easy to customize |
| embassy | Embedded systems, microcontrollers | Single-threaded or cooperative multitasking | HAL integration, no_std support |
Note on blocking: In an async context, don't call blocking functions (like std::fs::read) directly — they occupy an executor worker thread and stall every task scheduled on it. Use tokio::task::spawn_blocking() for CPU-bound or blocking operations instead.
10. Closures & Iterators Core
Closure Traits: Fn, FnMut, and FnOnce
Closures in Rust capture variables from their environment. The compiler automatically determines which trait a closure implements based on how it uses captured values. This is a hierarchy:
- FnOnce: consumes (takes ownership of) captured values. Can be called at most once. Use when the closure needs to move values out of its environment.
- FnMut: mutably borrows captured values. Can be called multiple times; each call may mutate captured state. Use for closures that modify external variables.
- Fn: immutably borrows captured values. Can be called multiple times without mutating its captures. The most reusable kind.
The relationship: Every closure is FnOnce. If it doesn't move values, it's also FnMut. If it doesn't mutate, it's also Fn. In other words: Fn ⊆ FnMut ⊆ FnOnce.
Closure bounds and the hierarchy: a parameter bounded by FnOnce accepts any closure, since every closure is at least FnOnce. An FnMut bound accepts FnMut and Fn closures. An Fn bound accepts only Fn closures. In short, closures flow toward weaker bounds — an Fn closure can go where FnMut or FnOnce is expected, never the reverse.
Explicit capture: Use move keyword to force a closure to take ownership of all captured variables, even if it would normally borrow.
let name = String::from("Alice");
// Fn — captures name immutably, can call many times
let greet = || println!("Hello, {}", name);
greet();
greet(); // OK: called multiple times
// FnMut — captures count mutably, modifies external state
let mut count = 0;
let mut increment = || {
count += 1; // mutable borrow
println!("Count: {}", count);
};
increment(); // Count: 1
increment(); // Count: 2
// FnOnce — takes ownership via move
let name = String::from("Bob");
let consume = move || {
println!("Name: {}", name);
drop(name); // consume/move the value
};
consume();
// consume(); // ERROR: already called (FnOnce)
// println!("{}", name); // ERROR: name was moved into closure
// Function taking FnOnce — closure will be called once, may consume
fn call_once<F: FnOnce()>(f: F) {
f();
}
// Function taking FnMut — closure may be called multiple times, may mutate
fn call_multiple<F: FnMut()>(mut f: F) {
f();
f();
}
// Function taking Fn — closure called multiple times, no side effects
fn call_as_filter<F: Fn(i32) -> bool>(nums: &[i32], predicate: F) -> Vec<i32> {
nums.iter().copied().filter(predicate).collect()
}
Iterators and Lazy Evaluation
Iterators in Rust are lazy: they don't execute until you call a consuming adapter like .collect(), .sum(), .for_each(), etc. This design enables zero-cost abstractions — the compiler can optimize iterator chains into tight loops.
Iterator adaptors (lazy) — return an iterator, don't execute yet:
- .map(f) — apply closure to each element
- .filter(f) — keep elements where f returns true
- .take(n) — yield first n elements
- .skip(n) — skip first n elements
- .zip(other) — pair elements from two iterators
- .enumerate() — yield (index, element) pairs
- .flat_map(f) — map then flatten
Consuming adaptors (execute the iterator):
- .collect() — gather into a collection (Vec, HashSet, etc.)
- .sum() — sum numeric elements
- .product() — product of elements
- .fold(init, f) — reduce with accumulator
- .for_each(f) — run closure for side effects
- .any(f) — true if any element matches
- .all(f) — true if all elements match
- .find(f) — first element matching predicate
- .max(), .min() — largest/smallest element
- .count() — number of elements
// Lazy evaluation — nothing runs until collect()
let result: Vec<i32> = (1..=100)
.filter(|x| x % 2 == 0) // keep evens (lazy)
.map(|x| x * x) // square them (lazy)
.take(5) // first 5 (lazy)
.collect(); // [4, 16, 36, 64, 100] (execute now)
// Consuming with fold (reduce pattern)
let sum = (1..=5)
.map(|x| x * x)
.fold(0, |acc, x| acc + x); // 1+4+9+16+25 = 55
// any() and all() short-circuit
let has_even = vec![1, 2, 3].iter().any(|x| x % 2 == 0); // true (stops at 2)
// find() returns Option, stops on first match
let first_even = (1..=10).find(|x| x % 2 == 0); // Some(2)
// Custom iterator — implement Iterator trait
struct Counter { count: u32, max: u32 }
impl Iterator for Counter {
type Item = u32;
fn next(&mut self) -> Option<u32> {
if self.count < self.max {
self.count += 1;
Some(self.count)
} else {
None
}
}
}
let counter = Counter { count: 0, max: 5 };
let result: Vec<_> = counter.map(|x| x * 2).collect(); // [2, 4, 6, 8, 10]
11. Unsafe Rust Advanced
The Five Unsafe Capabilities
unsafe is not a license to ignore all rules. It only unlocks five specific capabilities that the borrow checker cannot verify statically. The borrow checker still runs inside unsafe blocks — you still can't, for example, create two simultaneous mutable references to the same value.
The five capabilities are:
- Dereference raw pointers (*const T and *mut T): raw pointers can be null or dangling; the compiler can't verify safety.
- Call unsafe functions or methods: functions marked unsafe fn have preconditions that only the caller can verify.
- Access or modify mutable static variables: global mutable state is inherently prone to data races.
- Implement unsafe traits: some traits (e.g., Send, Sync) carry safety invariants; a manual impl promises you uphold them.
- Access fields of unions: union fields share memory, so reading a field reinterprets whatever bytes were last written.
Core principle: You are responsible for verifying the invariants that the compiler can't. Use unsafe sparingly, document invariants with comments, and wrap unsafe code in safe abstractions.
Raw Pointers
Raw pointers (*const T and *mut T) are created safely (e.g., from a reference via as cast), but dereferencing them requires unsafe. They can be null, dangling, or misaligned — the compiler makes no guarantees.
Use cases:
- FFI (Foreign Function Interface): C/C++ libraries return raw pointers.
- Custom data structures: Implementing data structures that need pointer manipulation (linked lists, trees with back-pointers).
- Low-level systems programming: Memory-mapped I/O, hardware access.
// Creating raw pointers is safe
let mut x = 42;
let r1 = &x as *const i32; // immutable raw pointer from reference
let r2 = &mut x as *mut i32; // mutable raw pointer
// Dereferencing requires unsafe — compiler can't verify it's valid
unsafe {
println!("Value: {}", *r1); // read through raw pointer
*r2 = 100; // write through raw pointer
}
// Null pointer (safe to create, unsafe to dereference)
let null_ptr: *const i32 = std::ptr::null();
unsafe {
// This would be undefined behavior:
// println!("{}", *null_ptr);
}
FFI (Foreign Function Interface)
extern "C" blocks declare C functions. Calling them requires unsafe because the Rust compiler can't verify the C function's safety properties or ABI compatibility.
// Declare C functions in extern block
extern "C" {
fn abs(input: i32) -> i32;
fn malloc(size: usize) -> *mut u8;
fn free(ptr: *mut u8);
}
// Calling C functions is unsafe
let result = unsafe { abs(-42) }; // 42
// Wrapping C functions in safe abstractions
fn safe_abs(x: i32) -> i32 {
    // SAFETY: abs has no memory-safety preconditions. (Note: in C,
    // abs(INT_MIN) is undefined behavior — a robust wrapper would guard it.)
    unsafe { abs(x) }
}
Mutable Statics
Accessing or modifying static mut variables is unsafe because they introduce global mutable state, which can lead to data races in multi-threaded code. Prefer std::sync::OnceLock or parking_lot::Once for initialization, or std::sync::Mutex for mutable shared state.
// Mutable global state (avoid!)
static mut COUNTER: i32 = 0;
fn increment() {
unsafe {
COUNTER += 1;
}
}
// Better: use OnceLock for initialization
use std::sync::OnceLock;
static CONFIG: OnceLock<String> = OnceLock::new();
fn init_config(val: String) {
let _ = CONFIG.set(val); // safe, one-time initialization
}
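For counters specifically, the safe alternative to static mut is an atomic; a minimal sketch:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Safe global counter — no unsafe block, no data races
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn increment() -> u32 {
    // fetch_add returns the previous value; add 1 for the new count
    COUNTER.fetch_add(1, Ordering::Relaxed) + 1
}

fn main() {
    let first = increment();
    let second = increment();
    assert_eq!(second, first + 1);
}
```

The atomic expresses the same intent as the static mut counter, but the compiler can verify it — the unsafe block disappears entirely.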
Safe Abstractions Over Unsafe Code
The power of unsafe is in building safe abstractions. You write unsafe code carefully once, prove its correctness, and expose a safe API to callers. This is how standard library data structures are implemented.
// Safe public API, unsafe implementation details
pub fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
let len = values.len();
assert!(mid <= len, "index out of bounds");
let ptr = values.as_mut_ptr();
unsafe {
// SAFETY: We verified mid <= len, and ptr is valid for the entire slice.
// We are returning non-overlapping slices, satisfying the borrow checker.
(
std::slice::from_raw_parts_mut(ptr, mid),
std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
)
}
}
// Caller never sees unsafe — safe interface
let mut data = vec![1, 2, 3, 4];
let (left, right) = split_at_mut(&mut data, 2);
// left: &mut [1, 2], right: &mut [3, 4]
12. Macros Intermediate
Declarative Macros (macro_rules!)
Declarative macros use pattern matching on token streams to generate code at compile time. They are called with ! (e.g., vec!(), println!()). The compiler matches patterns and expands them.
Syntax: macro_rules! name { (pattern) => { expansion }; }
Pattern elements:
- expr — an expression
- stmt — a statement
- ty — a type
- ident — an identifier
- tt — a token tree (any sequence of tokens)
- $(...)* — zero or more repetitions (similar to regex)
- $(,)? — optional trailing separator
Key points:
- Macros operate on token trees, not parsed code. They don't have access to type information.
- Repetition with $(...)* and $(...)+ is common. Each captured variable keeps its $ prefix in the expansion.
- Declarative macros are powerful for DSLs (domain-specific languages) but can be verbose and hard to debug.
// Simple declarative macro
macro_rules! greet {
($name:expr) => {
println!("Hello, {}", $name);
};
}
greet!("Alice"); // Hello, Alice
// Macro with repetition — hashmap constructor
macro_rules! hashmap {
($($key:expr => $val:expr),* $(,)?) => {{
let mut map = std::collections::HashMap::new();
$( map.insert($key, $val); )* // expand for each key=>val pair
map
}};
}
let scores = hashmap! {
"Alice" => 95,
"Bob" => 87,
"Charlie" => 92,
};
// Macro matching different patterns (overloading)
macro_rules! print_type {
(i32) => { println!("Type is i32"); };
(String) => { println!("Type is String"); };
($ty:ty) => { println!("Type is: {}", stringify!($ty)); };
}
print_type!(i32); // Type is i32
print_type!(String); // Type is String
// Built-in macros: stringify!, concat!, etc.
let s = stringify!(x + y); // "x + y" as string
let combined = concat!("Hello", " ", "World"); // "Hello World"
Procedural Macros
Procedural macros are Rust functions that run at compile time, transforming token streams or AST into new code. They are more powerful than declarative macros but require a separate crate (with proc-macro = true in Cargo.toml).
Three kinds of procedural macros:
- Derive macros: #[derive(MyTrait)] — automatically implement traits. Most common.
- Attribute macros: #[my_attribute] — transform entire items (functions, structs, etc.).
- Function-like macros: my_macro!(...) — behave like declarative macros but with full access to the input token stream.
Example use cases:
- #[derive(Serialize, Deserialize)] — from serde; generates serialization code.
- #[tokio::main] — from tokio; wraps main in async runtime setup.
- #[test] — built into the compiler and test harness; marks functions as tests.
// Standard derive macros (no custom crate needed)
#[derive(Debug, Clone, PartialEq)]
struct Person {
name: String,
age: u32,
}
// Using derive-based traits
let p = Person { name: "Alice".into(), age: 30 };
println!("{:?}", p); // Debug trait
let p2 = p.clone(); // Clone trait
// Popular derive macros from external crates
#[derive(Debug, Serialize, Deserialize)]
struct Config {
port: u16,
host: String,
}
// Attribute macros example (tokio::main)
#[tokio::main]
async fn main() {
// Expands to: fn main() { tokio::runtime::Runtime::new().block_on(async { ... }) }
}
When to Use Each
| Type | Use Case | Complexity | Example |
|---|---|---|---|
| Declarative macro | Simple token patterns, DSLs | Low-Medium | vec!(), println!() |
| Derive macro | Automatic trait implementation | High (requires proc-macro crate) | serde::Serialize |
| Attribute macro | Transform or annotate items | High (requires proc-macro crate) | tokio::main, warp::path |
| Function-like macro | Complex code generation | High | Test frameworks, query builders |
13. Memory Layout Advanced
Struct Layout and Padding
By default, Rust uses repr(Rust) layout: the compiler may reorder fields to minimize padding and improve cache locality. This is an implementation detail and can change between compiler versions.
#[repr(C)] uses C-compatible layout, matching the rules of the C ABI. This is essential for FFI. Fields are laid out in declaration order, with padding inserted as needed for alignment.
Alignment: Each type has an alignment requirement (usually equal to its size for primitives, but can vary). The compiler pads fields to satisfy alignment.
Example:
#[repr(C)]
struct CStyle { a: u8, b: u32, c: u8 }
// Layout (declaration order, C ABI):
// a: u8 at offset 0 (1 byte, align 1), then 3 bytes of padding
// b: u32 at offset 4 (4 bytes, align 4)
// c: u8 at offset 8 (1 byte, align 1), then 3 bytes of tail padding
// Total: 12 bytes (size rounded up to the struct's alignment of 4)
struct Reordered { a: u8, b: u32, c: u8 }
// Layout (repr(Rust) — the compiler may reorder):
// a: u8 at offset 0 (1 byte)
// c: u8 at offset 1 (1 byte), then 2 bytes of padding
// b: u32 at offset 4 (4 bytes)
// Total: 8 bytes
To minimize size, order large fields before small ones.
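These padding rules can be checked directly with mem::size_of; a small sketch (sizes assume a typical target where u32 has alignment 4):

```rust
use std::mem::size_of;

// Declaration-order layout with padding (C ABI)
#[repr(C)]
struct CStyle { a: u8, b: u32, c: u8 }

// Default repr(Rust): the compiler is free to reorder fields
struct RustStyle { a: u8, b: u32, c: u8 }

fn main() {
    // repr(C): 1 + 3 pad + 4 + 1 + 3 tail pad = 12 bytes
    assert_eq!(size_of::<CStyle>(), 12);
    // repr(Rust) never does worse than declaration order;
    // current rustc packs the two u8s together for 8 bytes
    assert!(size_of::<RustStyle>() <= size_of::<CStyle>());
    println!("CStyle: {}, RustStyle: {}", size_of::<CStyle>(), size_of::<RustStyle>());
}
```

The exact repr(Rust) size is an implementation detail, so the sketch only asserts it is no larger than the C layout.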
Enum Layout and Discriminants
Enums store a discriminant (tag) to identify the variant, plus the largest variant's data. The discriminant is typically a small integer.
enum Result<T, E> {
Ok(T),
Err(E),
}
// Simplified layout (for 64-bit system):
// discriminant: u32 (0 = Ok, 1 = Err)
// data: max(sizeof(T), sizeof(E))
// Plus padding as needed
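The tag-plus-largest-variant rule can be observed by measuring a hypothetical enum; a sketch (the exact size is implementation-defined under repr(Rust)):

```rust
use std::mem::size_of;

// An enum needs a discriminant plus room for its largest variant
#[allow(dead_code)]
enum Shape {
    Circle(f64),      // 8-byte payload
    Rect(f64, f64),   // 16-byte payload (largest)
}

fn main() {
    // At least the largest payload; tag and padding add more,
    // since f64 has no spare bit patterns for a niche
    assert!(size_of::<Shape>() >= 16);
    println!("Shape is {} bytes", size_of::<Shape>());
}
```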
Rust's compiler is smart about space. It uses niche optimization to reduce enum sizes when possible.
Niche Optimization
Niche optimization means the compiler reuses invalid values of a type as discriminants. This saves space.
- Option<&T> is the same size as &T because references can never be null; the compiler uses null as the None discriminant.
- Option<bool> is 1 byte (not 2) because bool only uses the bit patterns 0 and 1, leaving 2–255 as niches.
- Option<Option<bool>> is still 1 byte — multiple levels of nesting don't add size.
- Option<NonZeroU32> is 4 bytes because NonZeroU32 never contains 0.
use std::mem;
use std::num::NonZeroU32;
// Niche optimization examples
assert_eq!(mem::size_of::<&i32>(), 8); // 8 bytes
assert_eq!(mem::size_of::<Option<&i32>>(), 8); // still 8! (null = None)
assert_eq!(mem::size_of::<bool>(), 1); // 1 byte
assert_eq!(mem::size_of::<Option<bool>>(), 1); // still 1! (2 and 3 unused)
assert_eq!(mem::size_of::<Option<Option<bool>>>(), 1); // still 1!
assert_eq!(mem::size_of::<NonZeroU32>(), 4); // 4 bytes (never zero)
assert_eq!(mem::size_of::<Option<NonZeroU32>>(), 4); // still 4! (0 = None)
Zero-Sized Types (ZSTs)
A zero-sized type has size 0. The compiler must still distinguish them at compile time (for type checking), but they don't occupy runtime space.
- () — the unit type
- Empty structs: struct Marker;
- PhantomData<T> — a zero-sized marker for ownership/lifetime tracking
- Arrays of ZSTs: Vec<()> — size_of is 24 (just the Vec's pointer/capacity/length metadata); the elements occupy no space
Use cases:
- Marker types: PhantomData<T> tells Rust "I logically own a T" without storing it. Useful in generics (e.g., lifetime tracking, variance control).
- Compile-time state machines: different empty types represent different states (see typestate pattern).
- Set implementations: HashSet<T> is implemented as HashMap<T, ()>.
use std::mem;
use std::marker::PhantomData;
// Zero-sized types
assert_eq!(mem::size_of::<()>(), 0);
assert_eq!(mem::size_of::<Vec<()>>(), 24); // vec metadata only
// PhantomData for compile-time tracking
struct Owned<T> {
data: *mut T,
_phantom: PhantomData<T>, // ZST, tells compiler "I own T"
}
assert_eq!(mem::size_of::<Owned<i32>>(), 8); // just the pointer, no overhead
Checking Sizes and Alignment
Use std::mem::size_of(), std::mem::align_of(), and std::mem::size_of_val() to inspect memory layout at runtime (or compile time with const).
use std::mem;
struct Example {
a: u8,
b: u32,
c: u16,
}
println!("size: {}, align: {}", mem::size_of::<Example>(), mem::align_of::<Example>());
// Field offsets via offset_of! (stable since Rust 1.77)
println!("offset of b: {}", mem::offset_of!(Example, b));
14. Design Patterns Intermediate
Newtype Pattern
// Type safety without runtime cost
struct Meters(f64);
struct Kilometers(f64);
fn travel(distance: Kilometers) { /* ... */ }
// travel(Meters(100.0)); // ERROR: expected Kilometers, got Meters
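Conversions between newtypes can stay explicit via From/Into; a sketch (the Meters→Kilometers conversion here is illustrative):

```rust
struct Meters(f64);
struct Kilometers(f64);

// Explicit, type-checked conversion — no silent unit mix-ups
impl From<Meters> for Kilometers {
    fn from(m: Meters) -> Self {
        Kilometers(m.0 / 1000.0)
    }
}

fn main() {
    let km: Kilometers = Meters(1500.0).into();
    assert!((km.0 - 1.5).abs() < 1e-9);
}
```

The wrapper compiles away entirely; at runtime a Meters is just an f64.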
Builder Pattern
struct Server { host: String, port: u16, max_conn: usize }
struct ServerBuilder { host: String, port: u16, max_conn: usize }
impl ServerBuilder {
fn new() -> Self {
Self { host: "0.0.0.0".into(), port: 8080, max_conn: 100 }
}
fn host(mut self, host: &str) -> Self { self.host = host.into(); self }
fn port(mut self, port: u16) -> Self { self.port = port; self }
fn build(self) -> Server {
Server { host: self.host, port: self.port, max_conn: self.max_conn }
}
}
let server = ServerBuilder::new().host("localhost").port(3000).build();
Typestate Pattern
// Compile-time state machine — invalid states are unrepresentable
struct Locked;
struct Unlocked;
struct Door<State> { _state: std::marker::PhantomData<State> }
impl Door<Locked> {
fn unlock(self) -> Door<Unlocked> {
println!("Unlocking...");
Door { _state: std::marker::PhantomData }
}
}
impl Door<Unlocked> {
fn open(&self) { println!("Opening door"); }
fn lock(self) -> Door<Locked> {
Door { _state: std::marker::PhantomData }
}
}
let door = Door::<Locked> { _state: std::marker::PhantomData };
// door.open(); // ERROR: no method `open` for Door<Locked>
let door = door.unlock();
door.open(); // OK!
15. Testing Core
// Unit tests — in the same file
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_add() {
assert_eq!(add(2, 3), 5);
}
#[test]
#[should_panic(expected = "divide by zero")]
fn test_divide_by_zero() {
divide(10, 0);
}
#[test]
fn test_result() -> Result<(), Box<dyn std::error::Error>> {
let val: i32 = "42".parse()?;
assert_eq!(val, 42);
Ok(())
}
}
// Integration tests — in tests/ directory
// tests/integration_test.rs
use my_crate::public_function;
#[test]
fn integration_test() {
assert!(public_function(5) > 0);
}
// Property-based testing with proptest
use proptest::prelude::*;
proptest! {
#[test]
fn doesnt_crash(s in "\\PC*") {
let _ = parse(&s); // should never panic
}
}
16. Performance Intermediate
Zero-Cost Abstractions
Rust's high-level abstractions (iterators, closures, traits, generics) typically compile down to the same machine code you'd write by hand. Iterator chains like .map().filter().collect() produce optimized loop code with no intermediate allocations and no virtual dispatch. This is achieved through monomorphization (generics) and inlining, letting you write elegant, composable code without sacrificing performance.
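This equivalence can be sanity-checked by writing the same computation both ways; a sketch:

```rust
// Iterator chain: sum of squares of even elements
fn sum_even_squares_iter(data: &[i32]) -> i32 {
    data.iter().filter(|&&x| x % 2 == 0).map(|&x| x * x).sum()
}

// Hand-written loop computing the same thing
fn sum_even_squares_loop(data: &[i32]) -> i32 {
    let mut acc = 0;
    for &x in data {
        if x % 2 == 0 { acc += x * x; }
    }
    acc
}

fn main() {
    let data = [1, 2, 3, 4, 5];
    assert_eq!(sum_even_squares_iter(&data), 20); // 4 + 16
    assert_eq!(sum_even_squares_loop(&data), 20);
}
```

In release builds both functions compile to essentially the same loop; the iterator version adds no allocations or dynamic dispatch.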
Common Performance Tips
| Technique | When | Example |
|---|---|---|
| Use &str over String | Read-only access | fn process(s: &str) |
| Use Cow<str> | Maybe-clone scenarios | Parse input, clone only on modification |
| Vec::with_capacity(n) | Known size | Avoid reallocations |
| collect() with type hint | Iterator chains | .collect::<Vec<_>>() |
| #[inline] | Small hot functions | Cross-crate inlining hint |
| Avoid .clone() | Hot paths | Borrow instead |
| Use HashMap::entry | Insert-or-update | Single lookup instead of two |
| Profile before optimizing | Always | cargo flamegraph, perf |
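The HashMap::entry row in action — insert-or-update with a single hash lookup; a minimal sketch:

```rust
use std::collections::HashMap;

// One lookup per word: entry() finds or creates the slot in a single pass
fn word_counts<'a>(words: &[&'a str]) -> HashMap<&'a str, u32> {
    let mut counts = HashMap::new();
    for &w in words {
        *counts.entry(w).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_counts(&["a", "b", "a"]);
    assert_eq!(counts["a"], 2);
    assert_eq!(counts["b"], 1);
}
```

The naive alternative — contains_key followed by insert — hashes the key twice and is a common clippy lint.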
17. Ecosystem & Tooling General
| Tool | Purpose | Command |
|---|---|---|
| cargo | Build system & package manager | cargo build / run / test |
| rustfmt | Code formatter | cargo fmt |
| clippy | Lint collection (500+ lints) | cargo clippy |
| rust-analyzer | IDE/LSP support | VS Code extension |
| miri | Undefined behavior detector | cargo +nightly miri run |
| cargo-expand | View macro expansion | cargo expand |
| cargo-flamegraph | CPU profiling | cargo flamegraph |
| cargo-audit | Security vulnerability scan | cargo audit |
| cargo-deny | License & dependency policy | cargo deny check |
Must-Know Crates
| Category | Crate | Purpose |
|---|---|---|
| Serialization | serde + serde_json | De/serialization framework |
| Async Runtime | tokio | Async runtime + utilities |
| HTTP Client | reqwest | Ergonomic HTTP client |
| Web Framework | axum / actix-web | Web server frameworks |
| CLI | clap | Command-line argument parsing |
| Error Handling | thiserror / anyhow | Error types and context |
| Logging | tracing | Structured logging + spans |
| Database | sqlx / diesel | Async SQL / ORM |
| Testing | proptest / rstest | Property-based / parameterized tests |
| Parallelism | rayon | Data parallelism for iterators |
18. Coding Challenges Practice
Challenge 1: Implement a Thread-Safe Cache
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::hash::Hash;
#[derive(Clone)]
struct Cache<K, V> {
store: Arc<RwLock<HashMap<K, V>>>,
}
impl<K: Eq + Hash + Clone, V: Clone> Cache<K, V> {
fn new() -> Self {
Self { store: Arc::new(RwLock::new(HashMap::new())) }
}
fn get(&self, key: &K) -> Option<V> {
self.store.read().unwrap().get(key).cloned()
}
fn set(&self, key: K, value: V) {
self.store.write().unwrap().insert(key, value);
}
fn get_or_insert_with(&self, key: K, f: impl FnOnce() -> V) -> V {
// Try read first (no write lock needed)
if let Some(val) = self.store.read().unwrap().get(&key) {
return val.clone();
}
// Cache miss — acquire write lock
let mut store = self.store.write().unwrap();
// Double-check (another thread may have inserted)
store.entry(key).or_insert_with(f).clone()
}
}
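A compact restatement of the cache idea, exercised from several threads (this simplified sketch fixes the key/value types and accesses the store directly rather than through the full API above):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Cloning the Cache clones the Arc, so all handles share one store
#[derive(Clone)]
struct Cache {
    store: Arc<RwLock<HashMap<u32, u32>>>,
}

fn populate(cache: &Cache) {
    let handles: Vec<_> = (0..4u32)
        .map(|i| {
            let c = cache.clone();
            // Each thread takes the write lock briefly to insert one entry
            thread::spawn(move || {
                c.store.write().unwrap().insert(i, i * 10);
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}

fn main() {
    let cache = Cache { store: Arc::new(RwLock::new(HashMap::new())) };
    populate(&cache);
    assert_eq!(cache.store.read().unwrap().len(), 4);
    assert_eq!(cache.store.read().unwrap().get(&2), Some(&20));
}
```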
Challenge 2: Flatten a Nested Iterator
fn flatten<I>(iter: I) -> Flatten<I>
where
I: Iterator,
I::Item: IntoIterator,
{
Flatten { outer: iter, inner: None }
}
struct Flatten<I: Iterator>
where I::Item: IntoIterator
{
outer: I,
inner: Option<<I::Item as IntoIterator>::IntoIter>,
}
impl<I> Iterator for Flatten<I>
where
I: Iterator,
I::Item: IntoIterator,
{
type Item = <I::Item as IntoIterator>::Item;
fn next(&mut self) -> Option<Self::Item> {
loop {
if let Some(ref mut inner) = self.inner {
if let Some(item) = inner.next() {
return Some(item);
}
}
let next_inner = self.outer.next()?.into_iter();
self.inner = Some(next_inner);
}
}
}
// Usage: flatten(vec![vec![1,2], vec![3,4]].into_iter()) → [1,2,3,4]
Challenge 3: Implement a Simple Linked List
type Link<T> = Option<Box<Node<T>>>;
struct Node<T> { val: T, next: Link<T> }
struct List<T> { head: Link<T> }
impl<T> List<T> {
fn new() -> Self { Self { head: None } }
fn push(&mut self, val: T) {
let new_node = Box::new(Node {
val,
next: self.head.take(), // take ownership of old head
});
self.head = Some(new_node);
}
fn pop(&mut self) -> Option<T> {
self.head.take().map(|node| {
self.head = node.next;
node.val
})
}
fn peek(&self) -> Option<&T> {
self.head.as_ref().map(|node| &node.val)
}
}
// Proper Drop to avoid stack overflow on deep lists
impl<T> Drop for List<T> {
fn drop(&mut self) {
let mut cur = self.head.take();
while let Some(mut node) = cur {
cur = node.next.take(); // iterative drop, not recursive
}
}
}
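Restating the list in compact form to show its stack (LIFO) behavior; a sketch that omits peek and the iterative Drop for brevity:

```rust
type Link<T> = Option<Box<Node<T>>>;
struct Node<T> { val: T, next: Link<T> }
struct List<T> { head: Link<T> }

impl<T> List<T> {
    fn push(&mut self, val: T) {
        // New node takes ownership of the old head
        self.head = Some(Box::new(Node { val, next: self.head.take() }));
    }
    fn pop(&mut self) -> Option<T> {
        // take() the head, promote its next, move the value out
        self.head.take().map(|n| {
            self.head = n.next;
            n.val
        })
    }
}

fn main() {
    let mut list = List { head: None };
    list.push(1);
    list.push(2);
    assert_eq!(list.pop(), Some(2)); // last in, first out
    assert_eq!(list.pop(), Some(1));
    assert_eq!(list.pop(), None);
}
```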
19. System Design in Rust Senior
When to Choose Rust Over Other Languages
Rust excels in several scenarios where other languages fall short:
- Memory safety without GC pauses: Embedded systems and real-time applications require predictable performance without garbage collection overhead.
- High performance + safety: Network services, databases, and game engines benefit from Rust's zero-cost abstractions and compile-time guarantees.
- Fearless concurrency: Data-parallel processing and concurrent algorithms are easier to reason about with Rust's ownership system.
- Long-running services: Services where GC pauses are unacceptable (trading systems, low-latency services) need Rust's deterministic performance.
- WebAssembly targets: Rust's WASM support with excellent tooling (wasm-pack, wasm-bindgen) makes it ideal for performance-critical browser code.
- Security-critical code: Cryptography, OS kernel modules, and trusted computing benefit from compile-time memory safety guarantees with no runtime overhead.
Rust in Production: Common Architectures
| Use Case | Stack | Companies |
|---|---|---|
| Web API | Axum + SQLx + Tokio | Cloudflare, Discord |
| CLI Tools | Clap + Serde + Crossterm | Starship, ripgrep, bat, fd |
| Systems / Infra | Custom + Tokio + Serde | AWS (Firecracker), Meta, Dropbox |
| Embedded | no_std + embassy + HAL | Oxide Computer, Framework Laptop |
| Blockchain | Substrate / Solana runtime | Polkadot, Solana |
| Databases | Custom B-tree + io_uring | TiKV, SurrealDB, Qdrant |
| WebAssembly | wasm-bindgen + wasm-pack | Figma, 1Password |
20. Cheat Sheet
Quick Reference: Ownership & Borrowing
| Scenario | Code | What Happens |
|---|---|---|
| Move | let b = a; | a is invalid (non-Copy types) |
| Copy | let b = a; | a is still valid (Copy types) |
| Immutable borrow | let r = &a; | Multiple allowed simultaneously |
| Mutable borrow | let r = &mut a; | Exclusive — no other borrows allowed |
| Clone | let b = a.clone(); | Deep copy — both valid |
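The table above, as runnable code; a sketch:

```rust
// Move: the String's heap buffer changes hands, the old binding dies
fn demo_move() -> String {
    let a = String::from("hi");
    let b = a; // move — `a` is invalid from here on
    // println!("{a}"); // ERROR: borrow of moved value
    b
}

// Copy: i32 is a Copy type, so assignment duplicates it bitwise
fn demo_copy() -> i32 {
    let x = 5;
    let y = x; // copy — `x` is still valid
    x + y
}

// Clone: explicit deep copy, both bindings stay valid
fn demo_clone() -> (String, String) {
    let a = String::from("hi");
    let b = a.clone();
    (a, b)
}

fn main() {
    assert_eq!(demo_move(), "hi");
    assert_eq!(demo_copy(), 10);
    let (a, b) = demo_clone();
    assert_eq!(a, b);
}
```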
Quick Reference: Common Conversions
| From | To | Method |
|---|---|---|
| &str | String | s.to_string() or String::from(s) or s.to_owned() |
| String | &str | &s or s.as_str() |
| &str | i32 | s.parse::<i32>()? |
| i32 | String | n.to_string() |
| Vec<T> | &[T] | &v or v.as_slice() |
| &[T] | Vec<T> | s.to_vec() |
| Option<T> | Result<T, E> | opt.ok_or(err)? |
| Result<T, E> | Option<T> | res.ok() |
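A few of these conversions round-tripped in one place; a sketch:

```rust
// &str -> i32 -> String, propagating parse errors with ?
fn parse_and_back(s: &str) -> Result<String, std::num::ParseIntError> {
    let n: i32 = s.parse::<i32>()?;
    Ok(n.to_string())
}

fn main() {
    assert_eq!(parse_and_back("42").unwrap(), "42");

    let owned: String = "hello".to_string(); // &str -> String
    let slice: &str = owned.as_str();        // String -> &str
    assert_eq!(slice.len(), 5);

    let opt: Option<i32> = Some(7);
    let res: Result<i32, &str> = opt.ok_or("missing"); // Option -> Result
    assert_eq!(res.ok(), Some(7));                     // Result -> Option
}
```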
Quick Reference: Lifetime Annotations
| Syntax | Meaning |
|---|---|
| &'a T | Reference valid for at least lifetime 'a |
| &'static T | Reference valid for entire program |
| T: 'a | T contains no references shorter than 'a |
| T: 'static | T owns all its data (or refs are 'static) |
| for<'a> | Higher-ranked trait bound (HRTB) — works for ANY lifetime |
21. Latest Rust Features (2025–2026)
Stable: 1.94.0 (Mar 5, 2026) · Beta: 1.95.0 (Apr 16, 2026) · Nightly: 1.96.0 (May 28, 2026)
Rust Edition 2024 (Shipped with 1.85.0)
The largest edition since 2018. Key changes that require edition = "2024" in Cargo.toml:
| Change | What It Does | Impact |
|---|---|---|
| unsafe_op_in_unsafe_fn warn-by-default | Unsafe operations inside unsafe fn now require explicit unsafe { } blocks | Better unsafe code auditing |
| RPIT lifetime capture reform | impl Trait return types now capture all in-scope lifetimes by default | Use + use<'a> to opt out |
| if let temporary scope changes | Temporaries in if let are dropped at end of the if block, not the statement | Fewer surprising borrow errors |
| Tail expression temporary scope | Temporaries in tail expressions live shorter | Prevents dangling ref bugs |
| unsafe extern blocks | All items in extern blocks require unsafe annotation | Explicit FFI safety |
| gen keyword reserved | Reserved for future generator syntax | Cannot use as identifier |
| Rustfmt style editions | Formatting style independent of Rust edition | Easier upgrades for large codebases |
| Rust-version-aware resolver | Cargo picks dependency versions compatible with your MSRV | Fewer build failures |
Stable Release Highlights: 1.85 → 1.94
1.85.0 — Feb 20, 2025 (Edition 2024)
// Async closures — first-class async || {} that properly capture references
let db = &database;
let fetch = async || {
db.query("SELECT * FROM users").await
};
// Works with the new AsyncFn traits
async fn retry<F: AsyncFn() -> Option<T>, T>(f: F, attempts: u32) -> Option<T> {
for _ in 0..attempts {
if let Some(val) = f().await { return Some(val); }
}
None
}
// #[diagnostic::do_not_recommend] — hide unhelpful trait suggestions
#[diagnostic::do_not_recommend]
impl<T: Into<String>> MyTrait for T {}
1.86.0 — Apr 3, 2025
// Trait upcasting — coerce dyn SubTrait to dyn SuperTrait automatically
trait Base { fn name(&self) -> &str; }
trait Extended: Base { fn extra(&self); }
fn use_base(obj: &dyn Base) { println!("{}", obj.name()); }
fn example(ext: &dyn Extended) {
use_base(ext); // ✅ automatic upcasting! (was error before 1.86)
}
// Disjoint mutable indexing — multiple &mut at different indices
let mut v = vec![1, 2, 3, 4, 5];
let [a, _, b] = v.get_disjoint_mut([0, 2, 4]).unwrap(); // three &mut at once!
*a = 10; *b = 50;
// Safe #[target_feature] on safe functions
#[target_feature(enable = "avx2")]
fn fast_sum(data: &[f32]) -> f32 { /* SIMD code, no unsafe needed */ }
// New APIs: Vec::pop_if, Once::wait, f64::next_up/next_down
let mut v = vec![1, 2, 3];
let popped = v.pop_if(|&mut x| x > 2); // Some(3)
1.87.0 — May 15, 2025
// Vec::extract_if — drain elements matching a predicate
let mut v = vec![1, 2, 3, 4, 5, 6];
let evens: Vec<_> = v.extract_if(.., |x| *x % 2 == 0).collect();
// evens = [2, 4, 6], v = [1, 3, 5]
1.88.0 — Jun 26, 2025
// let chains in if/while — stabilized for Edition 2024
if let Some(x) = opt && x > 0 && let Some(y) = compute(x) {
process(x, y);
}
while let Some(item) = iter.next() && item.is_valid() {
handle(item);
}
// Naked functions — full assembly control with no prologue/epilogue
#[unsafe(naked)]
unsafe extern "C" fn context_switch() {
core::arch::naked_asm!(
"push rbp",
"mov rbp, rsp",
"pop rbp",
"ret",
);
}
// Cell::update — modify in place without get+set
use std::cell::Cell;
let c = Cell::new(5);
c.update(|v| v + 1); // c is now 6
// Cargo auto-cleans its cache (gc)
1.89.0 — Aug 7, 2025
// #[repr(u128)] / #[repr(i128)] — 128-bit discriminants for enums
#[repr(u128)]
enum BigEnum {
A = 1 << 100,
B = 1 << 127,
}
// Explicitly inferred const arguments
fn make_array<const N: usize>() -> [u8; N] { [0; N] }
let arr: [u8; 4] = make_array::<_>(); // compiler infers N = 4 from the annotation
// File locking — cross-platform advisory file locks
use std::fs::File;
let f = File::open("data.lock")?;
f.lock()?; // blocking exclusive lock
f.try_lock()?; // non-blocking
f.lock_shared()?; // shared (read) lock
f.unlock()?;
// AVX-512 + SHA-512 + SM3/SM4 target features stabilized
1.90.0 — Sep 18, 2025
// lld is now the default linker on x86_64-unknown-linux-gnu
// → Significantly faster link times for Linux builds
// No code changes needed — just upgrade Rust
1.91.0 — Oct 30, 2025
// Atomic pointer arithmetic operations — lock-free data structures
use std::sync::atomic::{AtomicPtr, Ordering};
let ptr = AtomicPtr::new(base_ptr);
ptr.fetch_ptr_add(4, Ordering::Relaxed); // atomic pointer offset
ptr.fetch_byte_add(16, Ordering::Relaxed); // byte-level offset
// Duration convenience constructors
use std::time::Duration;
let d = Duration::from_hours(2); // new!
let d = Duration::from_mins(30); // new!
// Path::file_prefix — stem before first dot
use std::path::Path;
let p = Path::new("archive.tar.gz");
assert_eq!(p.file_prefix(), Some("archive".as_ref()));
// vs file_stem() → "archive.tar"
// C-style variadic functions for sysv64, win64, efiapi ABIs
// Cargo build.build-dir config for custom artifact directories
1.92.0 — Dec 11, 2025
// RwLockWriteGuard::downgrade — writer → reader without releasing
use std::sync::RwLock;
let lock = RwLock::new(vec![1, 2, 3]);
let mut write = lock.write().unwrap();
write.push(4);
let read = write.downgrade(); // ✅ atomic downgrade, no gap
println!("{:?}", &*read);
// Zero-initialized smart pointers
let b: Box<[u8; 4096]> = Box::new_zeroed(); // all zeros, no stack copy
let b = unsafe { b.assume_init() };
// Also for Rc and Arc
let arc_slice: Arc<[u8]> = Arc::new_zeroed_slice(1024);
// BTreeMap entry improvements
use std::collections::BTreeMap;
let mut map = BTreeMap::new();
let entry = map.entry("key").insert_entry(42); // returns OccupiedEntry
1.93.0 — Jan 22, 2026
// C-style variadic functions for system ABI — stabilized
pub unsafe extern "system" fn variadic_fn(fmt: *const c_char, ...) { }
// asm_cfg — use #[cfg] inside asm! blocks
core::arch::asm!(
#[cfg(target_arch = "x86_64")]
"nop",
#[cfg(target_arch = "aarch64")]
"nop",
);
// Const-eval: pointer copying byte-by-byte in const context
// s390x vector features + is_s390x_feature_detected! macro
1.94.0 — Mar 5, 2026 (Current Stable)
// array_windows — const-sized sliding windows over slices
let data = [1, 2, 3, 4, 5];
for window in data.array_windows::<3>() {
// window is &[i32; 3], not &[i32]
println!("{window:?}"); // [1,2,3], [2,3,4], [3,4,5]
}
// element_offset — find position of element by reference
let v = vec![10, 20, 30];
let r = &v[1];
assert_eq!(v.element_offset(r), Some(1));
// LazyCell/LazyLock accessor methods
use std::sync::LazyLock;
static CONFIG: LazyLock<String> = LazyLock::new(|| "loaded".into());
let val: Option<&String> = LazyLock::get(&CONFIG); // None if not yet init'd
LazyLock::force(&CONFIG);
let val = LazyLock::get(&CONFIG); // Some("loaded")
// Peekable::next_if_map
let mut iter = vec![1, 2, 3].into_iter().peekable();
let doubled = iter.next_if_map(|&x| if x < 3 { Some(x * 2) } else { None });
// Mathematical constants: EULER_GAMMA and GOLDEN_RATIO
use std::f64::consts::{EULER_GAMMA, GOLDEN_RATIO};
println!("γ = {EULER_GAMMA}"); // 0.5772156649...
println!("φ = {GOLDEN_RATIO}"); // 1.6180339887...
// f32/f64::mul_add now const-stable
const RESULT: f64 = (3.0_f64).mul_add(4.0, 5.0); // 17.0 at compile time
// Unicode 17 support
// AVX-512 FP16 + AArch64 NEON FP16 intrinsics stabilized
// Cargo: include config key, TOML v1.1 parsing, pubtime in registry
Nightly / Upcoming Features (1.95–1.96+)
gen Blocks — Generator-Based Iterators
Create iterators without boilerplate structs. The gen keyword turns a block into an iterator using yield. Still on nightly; RFC 3513 accepted, active development ongoing:
#![feature(gen_blocks)]
fn fibonacci() -> impl Iterator<Item = u64> {
gen {
let (mut a, mut b) = (0, 1);
loop {
yield a;
(a, b) = (b, a + b);
}
}
}
// async gen blocks for async streams
async gen {
for url in urls {
yield reqwest::get(url).await?.text().await?;
}
}
Tail Calls via become
#![feature(explicit_tail_calls)]
fn factorial(n: u64, acc: u64) -> u64 {
if n <= 1 { acc }
else { become factorial(n - 1, n * acc) } // guaranteed TCO
}
Pin Ergonomics
#![feature(pin_ergonomics)]
// &pin mut T replaces Pin<&mut T>
async fn process(data: &pin mut MyFuture) { data.poll().await; }
// Auto-reborrowing: reuse Pin without Pin::as_mut()
fn takes_pin(x: Pin<&mut Foo>) {
helper(x); // auto re-pin
helper(x); // can reuse!
}
Reflection (Tracking Issue #142577)
#![feature(reflection)]
// Compile-time type introspection for derive macros and serialization
// Early stage — API still evolving
Sized Hierarchy & Scalable Vectors (2026 Goal)
Stabilizing a Sized trait hierarchy as a standalone win that unblocks extern type. SVE (Scalable Vector Extension) intrinsics continuing as a nightly experiment.
Other Nightly Features in Progress
| Feature | Flag | Status |
|---|---|---|
| Never type ! coercions | never_type | Partially stabilized; full coercion rules in nightly |
| Struct field defaults | default_field_values | Provide default values for individual struct fields |
| Precise capturing | use<..> syntax | Stabilized in 1.85 for RPIT; expanding to TAIT |
| Target-spec JSON | target-spec-json | Flexible custom target specifications (tracking #151528) |
| Contracts / Pre/Post | contracts | Function pre/post-condition annotations |
Ecosystem Milestones (2025–2026)
| Milestone | Details |
|---|---|
| Rust in Linux Kernel | Mainline since 6.1; driver subsystems expanding (net, GPU, FS). Production Rust drivers in distros. |
| Ferrocene (Safety-Critical) | ISO 26262 + IEC 61508 certified compiler by AdaCore/Ferrous Systems. Aerospace & medical adoption growing. |
| crates.io at 170K+ | Over 170,000 published crates. Ecosystem maturity accelerating. |
| Tree Borrows (PLDI 2025) | New aliasing model — more permissive than Stacked Borrows, validated in Miri. |
| Miri (POPL 2026) | Formal verification paper on Rust's operational semantics interpreter. |
| lld default linker | Rust 1.90+ uses lld on Linux x86_64 — dramatically faster link times. |
| cargo-semver-checks | Official integration for detecting semver violations in library releases. |
22. Advanced Deep Dive
Variance & Subtyping
Rust has subtyping only through lifetimes. Variance determines how type constructors relate to lifetime subtyping:
// If 'long: 'short (long outlives short), then:
// &'long T can be used where &'short T is expected (covariant in 'a)
// &'a mut T is INVARIANT in T — cannot substitute
// Covariant: &'a T, *const T, Vec<T>, Box<T>, Option<T>
// Contravariant: fn(T) (in argument position)
// Invariant: &'a mut T (in T), Cell<T>, UnsafeCell<T>
// Why invariance matters:
fn evil(input: &mut &'static str, short: &str) {
// If &mut T were covariant in T, this would compile:
// *input = short; // Would let a &'static str point to short-lived data!
// Rust prevents this by making &mut T invariant in T.
}
// PhantomData controls variance
use std::marker::PhantomData;
struct Deserializer<'de> {
data: *const u8,
_marker: PhantomData<&'de [u8]>, // Covariant in 'de
}
struct MutRef<'a, T> {
ptr: *mut T,
_marker: PhantomData<&'a mut T>, // Invariant in T and 'a
}
// Contravariance example
struct ContravariantLifetime<'a> {
_marker: PhantomData<fn(&'a ())>, // Contravariant in 'a
}
Higher-Ranked Trait Bounds (HRTBs)
// "for any lifetime 'a, F must implement Fn(&'a str) -> &'a str"
fn apply_to_ref<F>(f: F, s: &str) -> &str
where
F: for<'a> Fn(&'a str) -> &'a str,
{
f(s)
}
// Most common in closure bounds — Fn(&T) desugars to for<'a> Fn(&'a T)
fn filter_refs<F>(data: &[String], predicate: F) -> Vec<&str>
where
F: for<'a> Fn(&'a str) -> bool,
{
data.iter().filter(|s| predicate(s)).map(|s| s.as_str()).collect()
}
// HRTB with traits
trait Processor {
fn process<'a>(&self, input: &'a [u8]) -> &'a [u8];
}
fn run<P: Processor>(p: &P, data: &[u8]) -> Vec<u8> {
p.process(data).to_vec()
}
Generic Associated Types (GATs)
// GATs allow associated types to have their own generic parameters
trait LendingIterator {
type Item<'a> where Self: 'a;
fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}
// Classic use case: iterator that yields references to self
struct WindowsMut<'w, T> {
data: &'w mut [T],
pos: usize,
size: usize,
}
impl<'w, T> LendingIterator for WindowsMut<'w, T> {
type Item<'a> = &'a mut [T] where Self: 'a;
fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
if self.pos + self.size > self.data.len() {
return None;
}
let slice = &mut self.data[self.pos..self.pos + self.size];
self.pos += 1;
Some(slice)
}
}
// GATs with type parameters
trait Collection {
type Iter<'a, T: 'a>: Iterator<Item = &'a T> where Self: 'a;
}
// GAT-powered async trait (pre-AFIT pattern)
trait AsyncIterator {
type Item;
type Future<'a>: Future<Output = Option<Self::Item>> + 'a where Self: 'a;
fn next(&mut self) -> Self::Future<'_>;
}
Type-Level Programming
// Typestate pattern — encoding state in the type system
struct Locked;
struct Unlocked;
struct Door<State> {
_state: PhantomData<State>,
}
impl Door<Locked> {
fn unlock(self) -> Door<Unlocked> {
Door { _state: PhantomData }
}
}
impl Door<Unlocked> {
fn lock(self) -> Door<Locked> {
Door { _state: PhantomData }
}
fn open(&self) { println!("Door is open"); }
}
// Compile-time dimensional analysis — const params track unit exponents
struct Quantity<const METERS: i32, const SECONDS: i32>(f64);
type Meters = Quantity<1, 0>;
type Seconds = Quantity<0, 1>;
type MetersPerSecond = Quantity<1, -1>;
// Peano numbers at the type level
struct Zero;
struct Succ<N>(PhantomData<N>);
trait Nat { const VALUE: usize; }
impl Nat for Zero { const VALUE: usize = 0; }
impl<N: Nat> Nat for Succ<N> { const VALUE: usize = N::VALUE + 1; }
type One = Succ<Zero>;
type Two = Succ<One>;
type Three = Succ<Two>;
Drop Check & PhantomData Patterns
// Drop check: compiler ensures T is alive when a type implementing Drop is dropped
// The #[may_dangle] attribute opts out of this check for specific parameters
unsafe impl<#[may_dangle] T> Drop for MyVec<T> {
fn drop(&mut self) {
// We promise not to access T during drop
// Only deallocating memory, not reading T values
unsafe { dealloc(self.ptr as *mut u8, self.layout()); }
}
}
// PhantomData patterns summary:
// PhantomData<T> — owns T, drops T, covariant
// PhantomData<&'a T> — borrows T, covariant in both 'a and T
// PhantomData<&'a mut T> — mutably borrows T, covariant in 'a, invariant in T
// PhantomData<fn(T)> — contravariant in T
// PhantomData<fn() -> T> — covariant in T, no drop
// PhantomData<*const T> — covariant, no Send/Sync
// PhantomData<*mut T> — invariant, no Send/Sync
Auto Traits & Negative Impls
// Auto traits: Send, Sync, Unpin, UnwindSafe, RefUnwindSafe
// Automatically implemented unless a field opts out
// Why Rc<T> is not Send:
// Rc uses NonNull<RcBox<T>> internally
// NonNull<T> is !Send and !Sync
// So Rc<T> inherits !Send, !Sync
// Negative impls (nightly or via PhantomData)
// impl !Send for MyType {} // nightly
struct NotSend {
_marker: PhantomData<*const ()>, // stable way to be !Send + !Sync
}
// Asserting traits at compile time
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}
fn check_traits() {
assert_send::<Vec<i32>>(); // OK
assert_sync::<Arc<Mutex<i32>>>(); // OK
// assert_send::<Rc<i32>>(); // FAILS — Rc is !Send
// assert_sync::<Cell<i32>>(); // FAILS — Cell is !Sync
}
Orphan Rules & Coherence
// The orphan rule: you can only impl a trait for a type if:
// 1. The trait is defined in your crate, OR
// 2. The type is defined in your crate
//
// At least one of (trait, type) must be local.
// ALLOWED: local trait on foreign type
trait MyTrait {}
impl MyTrait for Vec<i32> {}
// ALLOWED: foreign trait on local type
struct MyType;
impl Display for MyType { /* ... */ }
// FORBIDDEN: foreign trait on foreign type
// impl Display for Vec<i32> {} // ERROR!
// Workaround: the newtype pattern
struct Wrapper(Vec<i32>);
impl Display for Wrapper {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "[{}]", self.0.iter()
.map(|n| n.to_string())
.collect::<Vec<_>>()
.join(", "))
}
}
// Fundamental types (#[fundamental]) relax the orphan rule:
// &T, &mut T, Box<T>, Pin<T> are fundamental
// So you CAN do: impl MyTrait for &ForeignType {}
Advanced Unsafe: Aliasing, Provenance & Stacked/Tree Borrows
// Stacked Borrows — the original aliasing model for unsafe Rust
// Each memory location has a "borrow stack":
// - &mut pushes Unique
// - & pushes SharedReadOnly
// - Raw pointers push SharedReadWrite
// Access must match the top of the stack; violations are UB
// Tree Borrows (2025) — more permissive replacement
// Instead of a stack, uses a tree structure
// Allows more interleaving of raw pointer access
// Still detected by Miri
// Example: a fresh reborrow creates a new borrow-stack entry;
// writing through the parent pointer then invalidates it
fn example() {
    let mut x = 42;
    let ptr1 = &mut x as *mut i32;
    let ptr2 = unsafe { &mut *ptr1 as *mut i32 }; // reborrow: pushed above ptr1
    unsafe {
        *ptr2 = 10; // OK: ptr2 is on top of the borrow stack
        *ptr1 = 20; // Pops ptr2's entry off the stack
        // *ptr2 = 30; // UB under Stacked Borrows — ptr2 was invalidated
    }
    // Note: a plain copy (`let ptr2 = ptr1;`) shares ptr1's tag, so the two
    // copies may be interleaved freely. Tree Borrows still rejects the write
    // above, but accepts more read interleavings than Stacked Borrows does.
}
// Pointer provenance: where a pointer "came from" determines what it can access
use std::ptr;
fn provenance_example() {
let a = 1u32;
let b = 2u32;
let ptr_a: *const u32 = &a;
let ptr_b: *const u32 = &b;
// Even if ptr_a and ptr_b happen to be adjacent in memory,
// you CANNOT use pointer arithmetic on ptr_a to reach b.
// The provenance of ptr_a only covers 'a'.
// Use strict_provenance APIs:
let addr = ptr_a.addr(); // Extract address (usize)
let new_ptr = ptr_b.with_addr(addr); // Attach addr to ptr_b's provenance
}
// Always validate unsafe code with Miri:
// cargo +nightly miri run
// cargo +nightly miri test
Advanced Const Generics
// Type-safe matrix multiplication with const generics
struct Matrix<const R: usize, const C: usize> {
data: [[f64; C]; R],
}
impl<const R: usize, const C: usize> Matrix<R, C> {
fn multiply<const C2: usize>(&self, rhs: &Matrix<C, C2>) -> Matrix<R, C2> {
let mut result = [[0.0; C2]; R];
for i in 0..R {
for j in 0..C2 {
for k in 0..C {
result[i][j] += self.data[i][k] * rhs.data[k][j];
}
}
}
Matrix { data: result }
}
}
// Dimension mismatch is a compile-time error:
let a: Matrix<2, 3> = /* ... */;
let b: Matrix<3, 4> = /* ... */;
let c: Matrix<2, 4> = a.multiply(&b); // OK: (2x3) * (3x4) = (2x4)
// let d = a.multiply(&a); // ERROR: (2x3) * (2x3) — dimension mismatch!
// Const generic bounds (nightly)
#![feature(generic_const_exprs)]
fn split_array<T, const N: usize, const M: usize>(
arr: [T; N + M],
) -> ([T; N], [T; M]) {
// Split array at compile-time-known boundary
todo!()
}
Advanced Async Patterns
use std::pin::Pin;
use std::future::Future;
use std::task::{Context, Poll};
// Manual Future implementation
struct Delay {
when: Instant,
}
impl Future for Delay {
type Output = ();
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
if Instant::now() >= self.when {
Poll::Ready(())
} else {
let waker = cx.waker().clone();
let when = self.when;
std::thread::spawn(move || {
std::thread::sleep(when - Instant::now());
waker.wake();
});
Poll::Pending
}
}
}
// Select pattern — race multiple futures
use tokio::select;
async fn timeout_fetch(url: &str) -> Result<String, &'static str> {
select! {
result = reqwest::get(url) => {
Ok(result.map_err(|_| "fetch error")?.text().await.map_err(|_| "read error")?)
}
_ = tokio::time::sleep(Duration::from_secs(5)) => {
Err("timeout")
}
}
}
// Cancellation-safe streams with pin_project
use pin_project::pin_project;
#[pin_project]
struct Throttle<S> {
#[pin]
stream: S,
#[pin]
delay: Option<tokio::time::Sleep>,
interval: Duration,
}
// Self-referential async state machines with Pin
struct AsyncStateMachine {
state: State,
buffer: Vec<u8>,
// Can't hold a reference to buffer without Pin
}
Specialization (Nightly)
#![feature(specialization)]
trait Log {
fn log(&self);
}
// Default implementation for all Display types
impl<T: Display> Log for T {
default fn log(&self) {
println!("{}", self);
}
}
// Specialized implementation for String — overrides the default
impl Log for String {
fn log(&self) {
println!("[String] {}", self);
}
}
// min_specialization: a safer subset (used internally by std)
#![feature(min_specialization)]
// Only allows specializing with marker traits and specific patterns
23. Difficult Interview Questions
Q1: Why does this code fail to compile? How do you fix it?
fn longest(x: &str, y: &str) -> &str {
if x.len() > y.len() { x } else { y }
}
A: The return type &str has an elided lifetime, but the compiler can't determine whether the return borrows from x or y. Lifetime elision only works when there's one input reference or &self. Fix: add explicit lifetimes tying the output to both inputs:
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
if x.len() > y.len() { x } else { y }
}
// Now the compiler knows the return value lives at least as long as
// the shorter of the two input lifetimes.
Q2: Explain the difference between &T, &mut T, *const T, and *mut T. When would you use raw pointers?
A: &T (shared ref) guarantees no mutation and the referent is alive. &mut T (exclusive ref) guarantees exclusive access and liveness. Both are checked at compile time by the borrow checker. *const T and *mut T (raw pointers) have no aliasing or liveness guarantees — they can be null, dangling, or aliased freely. Creating raw pointers is safe; dereferencing them requires unsafe. Use cases include FFI (C interop), building data structures the borrow checker can't express (doubly-linked lists, graphs), and performance-critical inner loops where bounds checks are eliminated.
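The safe-to-create / unsafe-to-dereference split from the answer above, as a small sketch (the function name is illustrative):

```rust
// Creating raw pointers is safe; only dereferencing needs `unsafe`.
fn bump_via_raw(mut x: i32) -> i32 {
    let q: *mut i32 = &mut x; // safe: no dereference yet
    let p: *const i32 = q;    // raw pointers may alias freely
    unsafe {
        *q += 1; // unsafe: we assert q is valid, aligned, and unaliased by refs
        *p       // reading through the aliasing *const is fine here
    }
}

fn main() {
    assert_eq!(bump_via_raw(10), 11);
}
```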
Q3: What is the difference between Box<dyn Trait> and impl Trait? When would you choose each?
A: impl Trait uses static dispatch (monomorphization) — the compiler generates specialized code for each concrete type. Zero runtime cost but increases binary size. Box<dyn Trait> uses dynamic dispatch via a vtable (fat pointer: data ptr + vtable ptr). Has indirection overhead (~1-3ns) and heap allocation but allows heterogeneous collections and reduces code bloat. Choose impl Trait for performance-critical paths with known types. Choose dyn Trait for plugin architectures, trait objects in collections, or when you need to return different concrete types from match arms.
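A sketch contrasting the two dispatch styles — the heterogeneous Vec at the end is the case impl Trait cannot express:

```rust
use std::fmt::Display;

// Static dispatch: monomorphized per concrete type, inlinable.
fn describe_static(x: impl Display) -> String {
    format!("value: {x}")
}

// Dynamic dispatch: one compiled body, called through a vtable.
fn describe_dyn(x: &dyn Display) -> String {
    format!("value: {x}")
}

fn main() {
    assert_eq!(describe_static(42), "value: 42");
    // Heterogeneous collection — only possible with dyn Trait:
    let items: Vec<Box<dyn Display>> = vec![Box::new(1), Box::new("two"), Box::new(3.0)];
    for item in &items {
        println!("{}", describe_dyn(item.as_ref()));
    }
}
```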
Q4: Why is Rc<T> not Send? Why is RefCell<T> not Sync? What should you use instead in multithreaded code?
A: Rc<T> uses non-atomic reference counting. If two threads increment/decrement the count simultaneously, it's a data race (UB). Use Arc<T> (atomic ref count) instead. RefCell<T> tracks borrows at runtime with a non-atomic counter. Concurrent borrow checks would race. Use Mutex<T> or RwLock<T> for thread-safe interior mutability. For lock-free scenarios, use AtomicU64, crossbeam structures, or parking_lot.
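A minimal Arc<Mutex<T>> counter — the standard thread-safe replacement for Rc/RefCell (function name and counts are illustrative):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn parallel_count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter); // atomic refcount bump — Send-safe
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1; // Mutex serializes access
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let n = *counter.lock().unwrap();
    n
}

fn main() {
    assert_eq!(parallel_count(4, 1000), 4000);
}
```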
Q5: What does Pin<Box<T>> guarantee? Why do async/await futures need pinning?
A: Pin<P> guarantees the pointee will not be moved in memory (unless T: Unpin). Async futures compiled from async fn are state machines that may hold references to their own local variables (self-referential). If the future is moved, those internal references become dangling. Pin prevents this. Unpin types (most types) can be freely moved even when pinned. Self-referential types must be !Unpin to enforce the pin guarantee. In practice, you see Pin<Box<dyn Future>> when using Box::pin() to heap-allocate a future and guarantee its address stability.
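A small demonstration of both halves of the answer: Unpin types pin trivially, and Box::pin keeps the pointee's address stable even when the handle itself is moved (function name is illustrative):

```rust
use std::pin::Pin;

// Moving the Pin<Box<..>> moves only the box (a pointer);
// the pointee's heap address never changes.
fn address_is_stable() -> bool {
    let pinned: Pin<Box<String>> = Box::pin(String::from("stable"));
    let before = &*pinned as *const String as usize;
    let moved = pinned; // move the handle, not the String
    let after = &*moved as *const String as usize;
    before == after
}

fn main() {
    // i32: Unpin, so pinning is a no-op and the value stays accessible:
    let mut n = 5;
    let p = Pin::new(&mut n);
    *p.get_mut() += 1; // get_mut() only exists because i32: Unpin
    assert_eq!(n, 6);
    assert!(address_is_stable());
}
```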
Q6: Explain this code — why does it compile, and what pattern does it demonstrate?
fn apply<F: FnOnce() -> String>(f: F) -> String { f() }
fn apply_ref<F: Fn() -> String>(f: &F) -> String { f() }
let s = String::from("hello");
let closure = move || s; // FnOnce — consumes s
apply(closure);
// apply_ref(&closure); // ERROR: closure is FnOnce, not Fn
A: The move || s closure captures s by value and then returns it, consuming the captured s. This means it can only be called once — it implements FnOnce but not Fn or FnMut. The hierarchy is FnOnce ⊃ FnMut ⊃ Fn. A closure that moves out of its captures is FnOnce-only. If it mutates captures without moving, it's FnMut. If it only reads, it's Fn. Calling apply_ref fails because &F requires F: Fn() (callable multiple times through a shared ref).
Q7: What is the "iterator invalidation" problem? How does Rust prevent it?
A: In C++, modifying a collection while iterating (push/erase) invalidates iterators, causing UB. Rust prevents this through the borrow checker: calling iter() borrows the collection immutably, so push() (which requires &mut self) is rejected at compile time. Even iter_mut() only gives mutable references to elements, not the collection itself. To modify during iteration, use patterns like retain(), extract_if() (formerly known as drain_filter), or collect into a new container.
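A minimal sketch of both sides — the compile-time rejection (left as a comment) and the idiomatic alternative (the helper name is illustrative):

```rust
fn keep_evens(mut v: Vec<i32>) -> Vec<i32> {
    // The line below is what Rust rejects at compile time:
    // for x in v.iter() { v.push(*x); } // ERROR: `v` already borrowed by iter()

    // retain() is the safe "modify while iterating" pattern:
    v.retain(|&x| x % 2 == 0);
    v
}

fn main() {
    assert_eq!(keep_evens(vec![1, 2, 3, 4, 5]), vec![2, 4]);
}
```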
Q8: What's the difference between String, &str, &[u8], OsStr, CStr, and Path? When do you use each?
A: String: owned, heap-allocated, guaranteed valid UTF-8. &str: borrowed slice of UTF-8 bytes. &[u8]: arbitrary byte slice, no encoding guarantee. OsStr/OsString: OS-native string (WTF-8 on Windows, bytes on Unix) — for filenames and env vars that might not be valid UTF-8. CStr/CString: null-terminated byte string for FFI with C. Path/PathBuf: wraps OsStr with path manipulation methods (join, extension, parent). Rule of thumb: &str for text APIs, Path for filesystem, CStr for FFI, OsStr when the OS dictates encoding.
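A few of these types side by side, as a quick sketch:

```rust
use std::ffi::{CString, OsStr};
use std::path::Path;

fn main() {
    let owned: String = String::from("héllo"); // owned, guaranteed UTF-8
    let slice: &str = &owned;                  // borrowed UTF-8 view
    let bytes: &[u8] = slice.as_bytes();       // raw bytes, no encoding guarantee
    assert_eq!(bytes.len(), 6);                // 'é' takes 2 bytes in UTF-8

    let path: &Path = Path::new("/tmp/data.txt"); // OsStr + path methods
    assert_eq!(path.extension(), Some(OsStr::new("txt")));

    let c: CString = CString::new("for C").unwrap(); // appends the NUL terminator
    assert_eq!(c.as_bytes_with_nul().last(), Some(&0u8));
}
```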
Q9: How do you create a safe abstraction over unsafe code? What invariants must you maintain?
A: The key principle is that safe code must never cause UB, regardless of input. Your unsafe block must uphold all safety invariants expected by Rust: no aliasing &mut references, no dangling references, no data races, no uninitialized reads, and no violation of type invariants. The pattern:
pub struct SafeWrapper {
ptr: *mut u8,
len: usize,
cap: usize,
}
impl SafeWrapper {
/// SAFETY: All public methods maintain these invariants:
/// 1. ptr is non-null and points to a valid allocation of `cap` bytes
/// 2. First `len` bytes are initialized
/// 3. cap >= len
/// 4. No aliasing &mut references exist to the underlying data
pub fn push(&mut self, byte: u8) {
if self.len == self.cap {
self.grow(); // Private, maintains invariants
}
// SAFETY: len < cap after grow(), so ptr.add(len) is in bounds
unsafe { self.ptr.add(self.len).write(byte); }
self.len += 1;
}
}
Always document safety invariants, minimize unsafe scope, validate with Miri, and consider using #[deny(unsafe_op_in_unsafe_fn)].
Q10: Explain how Send and Sync work. Can you implement them manually? When would you?
A: Send: safe to transfer ownership to another thread. Sync: safe to share references between threads (T: Sync ⟺ &T: Send). Both are auto-traits — the compiler implements them if all fields are Send/Sync. You can manually impl them with unsafe impl Send for MyType {} when you've used raw pointers or FFI types that the compiler can't verify but you know are safe. Example: a custom allocator wrapping *mut u8 that uses proper synchronization internally. Never impl Send/Sync unless you can prove thread safety — incorrect impls enable data races (UB).
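A hedged sketch of a manual unsafe impl — the struct and the safety claim are purely illustrative, not a real driver handle:

```rust
use std::ptr;

// The raw pointer makes `Handle` !Send/!Sync by default.
struct Handle {
    ptr: *mut u8,
}

// SAFETY (illustrative assumption): the resource behind `ptr` is only
// accessed through internal synchronization, so moving the handle across
// threads cannot race. An incorrect claim here enables data races (UB).
unsafe impl Send for Handle {}

// Compile-time assertion helper:
fn assert_send<T: Send>() {}

fn main() {
    assert_send::<Handle>(); // compiles only because of the unsafe impl
    let h = Handle { ptr: ptr::null_mut() };
    std::thread::spawn(move || {
        let _ = h.ptr; // handle crossed a thread boundary
    })
    .join()
    .unwrap();
}
```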
Q11: What happens when you call .await on a future? Describe the state machine transformation.
A: The compiler transforms the async fn into an enum state machine. Each .await point becomes a variant holding the local variables alive at that point. When polled, the future resumes from the last yield point. Example:
async fn example() -> i32 {
let a = fetch().await; // yield point 1
let b = process(a).await; // yield point 2
a + b
}
// Becomes roughly:
enum ExampleFuture {
Start,
AfterFetch { fetch_fut: FetchFuture },
AfterProcess { a: i32, proc_fut: ProcessFuture },
Done,
}
impl Future for ExampleFuture {
type Output = i32;
fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<i32> {
loop {
match *self { // simplified — real code must project through Pin
Self::Start => { /* create fetch_fut, transition to AfterFetch */ }
Self::AfterFetch { ref mut fetch_fut } => {
let a = ready!(Pin::new(fetch_fut).poll(cx));
// transition to AfterProcess, storing `a`
}
Self::AfterProcess { a, ref mut proc_fut } => {
let b = ready!(Pin::new(proc_fut).poll(cx));
return Poll::Ready(a + b);
}
Self::Done => panic!("polled after completion"),
}
}
}
}
Q12: What is macro hygiene in Rust? How do macro_rules! and proc macros differ in this regard?
A: Hygiene means identifiers introduced by a macro don't accidentally collide with identifiers at the call site. In macro_rules!, Rust assigns each identifier a "syntax context" — names introduced by the macro live in a different scope than names at the call site. This prevents accidental shadowing. Proc macros (derive, attribute, function-like) operate on TokenStream and are semi-hygienic: they use Span::call_site() (unhygienic, inherits caller's scope) or Span::mixed_site() (hygienic for local vars, unhygienic for items). Span::def_site() (fully hygienic) is nightly-only. Best practice: use call_site for generated type/function names the user needs to reference, and avoid introducing helper variables that might conflict.
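Hygiene in macro_rules!, sketched — the macro's internal `count` lives in its own syntax context and cannot collide with the caller's `count` (macro name is illustrative):

```rust
// `count` inside the macro is hygienically distinct from any caller `count`.
macro_rules! count_evens {
    ($v:expr) => {{
        let mut count = 0; // macro-local: a different "count" than the caller's
        for x in $v {
            if x % 2 == 0 {
                count += 1;
            }
        }
        count
    }};
}

fn main() {
    let count = 100; // caller's own `count` — untouched by the macro expansion
    let evens = count_evens!(vec![1, 2, 3, 4]);
    assert_eq!(evens, 2);
    assert_eq!(count, 100);
}
```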
Q13: Write a function that takes a closure and calls it, supporting all three Fn traits. Explain the trait hierarchy.
// FnOnce: can be called once (takes self by value)
// FnMut: can be called multiple times (takes &mut self)
// Fn: can be called multiple times (takes &self)
// Fn: FnMut: FnOnce — every Fn is also FnMut and FnOnce
fn call_once<F: FnOnce() -> R, R>(f: F) -> R { f() }
fn call_many<F: Fn() -> R, R>(f: &F, times: usize) -> Vec<R> {
(0..times).map(|_| f()).collect()
}
// The most permissive bound: FnOnce (accepts all closures)
// The most restrictive bound: Fn (only closures that don't mutate or consume)
// Key insight: Box<dyn FnOnce()> was long uncallable because calling
// requires moving out of the Box. Since Rust 1.35 it is directly callable
// thanks to a special compiler-provided implementation (before that, FnBox).
Q14: How does Cow<'a, B> work? Give a real-world example where it improves performance.
A: Cow (Clone on Write) wraps either a borrowed reference or an owned value. It defers cloning until mutation is needed:
use std::borrow::Cow;
fn normalize_path(path: &str) -> Cow<'_, str> {
if path.contains("//") {
// Only allocate when we actually need to modify
Cow::Owned(path.replace("//", "/"))
} else {
// Zero-cost: just returns a reference
Cow::Borrowed(path)
}
}
// In a web server processing 1M requests:
// ~95% of paths are clean → Cow::Borrowed (zero allocation)
// ~5% need normalization → Cow::Owned (allocates only when needed)
// vs. always returning String: saves ~950K heap allocations
Q15: What is the "semver trick" and why does it matter for library authors?
A: When a Rust library makes a breaking change (e.g., v1 → v2), downstream crates may depend on both versions transitively. The semver trick: the new version (v2) re-exports key types from the old version (v1) as a dependency, so types are compatible across versions. This lets the ecosystem migrate gradually. Without it, you get "expected foo::Bar v1, found foo::Bar v2" errors even though the types are identical. The cargo-semver-checks tool (now semi-official) detects accidental breaking changes in CI.
Q16: Explain the difference between tokio::spawn and tokio::task::spawn_blocking. What happens if you do CPU-heavy work in an async task?
A: tokio::spawn schedules a future on the async runtime's thread pool. These tasks must be non-blocking — they should spend most time waiting (I/O, timers), yielding the thread at each .await. If you do CPU-heavy work (parsing, crypto, compression), you block the entire runtime thread, starving other tasks. spawn_blocking moves the work to a dedicated blocking thread pool, freeing the async threads. Rule of thumb: if an operation takes >10-100µs without awaiting, use spawn_blocking or rayon for CPU parallelism.
Q17: What is MaybeUninit<T> and why did it replace mem::uninitialized()?
A: mem::uninitialized<T>() was unsound — it created a "value" of type T containing garbage bits. For types like bool or references, any bit pattern other than valid values is instant UB. MaybeUninit<T> is a union that wraps T without requiring initialization. You write to it via ptr::write or as_mut_ptr(), then call .assume_init() only after fully initializing it. This makes the initialization contract explicit and checkable:
use std::mem::MaybeUninit;
let mut arr: [MaybeUninit<String>; 3] = [
MaybeUninit::uninit(),
MaybeUninit::uninit(),
MaybeUninit::uninit(),
];
arr[0].write(String::from("a"));
arr[1].write(String::from("b"));
arr[2].write(String::from("c"));
// SAFETY: all elements initialized above
// (array_assume_init is nightly-only; on stable, transmute or assume_init per element)
let arr: [String; 3] = unsafe {
MaybeUninit::array_assume_init(arr)
};
Q18: How do zero-sized types (ZSTs) work? Give examples and explain their usefulness.
A: ZSTs have size_of::<T>() == 0 — no memory is allocated for them. Examples: (), PhantomData<T>, struct Marker;. Vec<()> is essentially a counter (its length) with no backing allocation, and HashMap<K, ()> is how HashSet<K> is implemented. ZSTs power typestate patterns, marker types, and capability tokens; reads and writes of ZST values compile to no-ops. The compiler optimizes Option<&T> to use null as None (niche optimization), but Option<ZST> still needs one byte for the discriminant, because a ZST has no niche to reuse.
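These layout facts can be checked directly with size_of (a small sketch):

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Marker;

fn main() {
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(size_of::<Marker>(), 0);
    assert_eq!(size_of::<PhantomData<String>>(), 0);

    // Vec<()> never allocates — its length is the only state:
    let mut v: Vec<()> = Vec::new();
    for _ in 0..1_000 {
        v.push(());
    }
    assert_eq!(v.len(), 1_000);
    assert_eq!(v.capacity(), usize::MAX); // ZST vecs report "infinite" capacity

    // No niche in a ZST, so Option<ZST> needs a discriminant byte:
    assert_eq!(size_of::<Option<()>>(), 1);
    // But Option<&T> reuses null as its niche — same size as &T:
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
}
```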
Q19: Describe how Arc<Mutex<T>> works internally. What's the performance impact versus message passing?
A: Arc: atomically reference-counted pointer. Uses AtomicUsize for strong/weak counts. Arc::clone() does an atomic increment (~5-20ns). Mutex: OS-backed (pthread_mutex on Linux) or parking-lot based. Lock acquisition in the uncontended case is a single CAS (~10-25ns). Contended: threads park (sleep), context switch cost (~1-10µs). Total Arc<Mutex<T>>::lock(): ~20-50ns uncontended. Message passing (channels): mpsc::channel() or crossbeam::channel — involves allocation, atomic ops, and potential parking. Throughput is similar for single-producer, but channels scale better with many producers because they avoid lock contention. Use shared state for small, frequently-read data. Use channels for work distribution and actor patterns.
Q20: What is object safety? Which traits can be made into trait objects?
A: A trait is object-safe (the Rust reference now calls this "dyn-compatible") if it can be used as dyn Trait. Requirements: no methods that return Self or take self by value (unless gated behind where Self: Sized), no generic method parameters, and the trait itself must not require Self: Sized. Clone is NOT object-safe because fn clone(&self) -> Self returns Self, whose size is unknown through dyn. Workarounds: add where Self: Sized to the offending methods (they then simply aren't callable on the trait object), or create a companion trait (the CloneBox pattern, packaged by the dyn-clone crate).
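A sketch of the CloneBox companion-trait workaround, using a hypothetical Shape trait:

```rust
// `Clone` is not object-safe, so Box<dyn Shape> can't derive cloning.
// Companion trait: clone behind the object-safe `clone_box` method.
trait Shape: CloneBox {
    fn area(&self) -> f64;
}

trait CloneBox {
    fn clone_box(&self) -> Box<dyn Shape>;
}

// Blanket impl: every Sized + Clone Shape gets clone_box for free.
impl<T: Shape + Clone + 'static> CloneBox for T {
    fn clone_box(&self) -> Box<dyn Shape> {
        Box::new(self.clone())
    }
}

impl Clone for Box<dyn Shape> {
    fn clone(&self) -> Self {
        self.clone_box()
    }
}

#[derive(Clone)]
struct Circle {
    r: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.r * self.r
    }
}

fn main() {
    let original: Box<dyn Shape> = Box::new(Circle { r: 1.0 });
    let copy = original.clone(); // works via clone_box
    assert!((copy.area() - std::f64::consts::PI).abs() < 1e-9);
}
```

The dyn-clone crate packages exactly this pattern for production use.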
Q21: What is the ?Sized bound and when do you need it?
A: By default, all generic parameters have an implicit T: Sized bound — the compiler needs to know the size at compile time for stack allocation and parameter passing. ?Sized removes this bound, allowing dynamically-sized types (DSTs) like str, [T], and dyn Trait. You need it when writing functions that accept both sized and unsized types, typically behind a reference:
// Only accepts Sized types (default)
fn print_sized<T: Display>(val: &T) { println!("{val}"); }
// Accepts DSTs too — can take &str, &[i32], &dyn Display
fn print_any<T: Display + ?Sized>(val: &T) { println!("{val}"); }
print_any("hello"); // T = str (unsized)
print_any(&42); // T = i32 (sized)
print_any(&12.5 as &dyn Display); // T = dyn Display (unsized)
Q22: Write a thread-safe, lazy-initialized singleton in Rust. Explain the tradeoffs.
use std::sync::OnceLock;
// Modern approach: OnceLock (stable since 1.70)
static CONFIG: OnceLock<AppConfig> = OnceLock::new();
fn get_config() -> &'static AppConfig {
CONFIG.get_or_init(|| {
AppConfig::load_from_file("config.toml")
.expect("failed to load config")
})
}
// LazyLock (stable since 1.80): combines OnceLock + init fn
use std::sync::LazyLock;
static DB_POOL: LazyLock<DbPool> = LazyLock::new(|| {
DbPool::connect("postgres://localhost/mydb")
.expect("failed to connect")
});
// Tradeoffs:
// + Thread-safe, zero-cost after first access (just a pointer load)
// + No external crates needed (was once_cell/lazy_static)
// - Global mutable state makes testing harder
// - Initialization errors are hard to handle (panic in static init)
// - Consider dependency injection instead for testability
Q23: What is #[repr(C)] vs #[repr(Rust)]? When does layout matter?
A: #[repr(Rust)] (default): the compiler may reorder fields to minimize padding. Layout is unspecified and can change between compiler versions. #[repr(C)]: fields are laid out in declaration order with C-compatible padding/alignment rules. Required for FFI structs shared with C code, memory-mapped I/O, network protocols, and when you need deterministic layout (e.g., transmute). Other reprs: #[repr(transparent)] — single-field struct has same layout as the field (for newtype FFI wrappers); #[repr(packed)] — no padding (may cause unaligned reads); #[repr(align(N))] — minimum alignment (useful for cache-line optimization).
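The declaration-order-plus-padding rule can be verified with size_of/align_of (the struct is illustrative):

```rust
use std::mem::{align_of, size_of};

// #[repr(C)]: fields laid out in declaration order with C padding rules.
#[repr(C)]
struct CPoint {
    tag: u8,  // 1 byte + 3 bytes padding (to align `x` to 4)
    x: u32,   // 4 bytes
    flag: u8, // 1 byte + 3 bytes trailing padding (struct align = 4)
}

fn main() {
    let _p = CPoint { tag: 1, x: 2, flag: 3 };
    assert_eq!(size_of::<CPoint>(), 12);
    assert_eq!(align_of::<CPoint>(), 4);
    // The default #[repr(Rust)] is free to reorder fields to pack tighter —
    // its layout is unspecified, so we deliberately don't assert on it.
}
```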
Q24: Explain the borrow checker's NLL (Non-Lexical Lifetimes). Give an example where NLL helps.
// Pre-NLL (Rust < 2018): borrows lasted until end of scope
// NLL: borrows end at last use point
fn example(map: &mut HashMap<String, Vec<i32>>, key: &str) {
// With NLL, the immutable borrow of `map` via `get` ends after the match
match map.get(key) {
Some(v) if !v.is_empty() => {
println!("found: {:?}", v);
// NLL: borrow of map from `get` ends here (last use of v)
}
_ => {
// NLL allows this mutable borrow because the immutable one ended
map.insert(key.to_string(), vec![42]);
}
}
}
// Pre-NLL: this would fail because `get`'s borrow lived until `}`
// NLL: compiles fine because the borrow dies at last use of `v`
Q25: How would you implement a lock-free concurrent stack? Outline the approach and pitfalls.
A: Use AtomicPtr for the head pointer. Push: create new node pointing to current head, then CAS head to new node. Pop: read head, CAS head to head.next, return old head's value. Pitfalls: the ABA problem (head changes A→B→A, CAS succeeds incorrectly) — solved with epoch-based reclamation (crossbeam-epoch) or hazard pointers. Memory reclamation: can't immediately free popped nodes (other threads may be reading them). The crossbeam crate provides battle-tested implementations. In Rust, the type system helps: AtomicPtr requires unsafe to dereference, making the danger explicit. Key orderings: Acquire on load, Release on store, AcqRel on CAS.
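A minimal Treiber-stack sketch along the lines described above. It is correct single-threaded, but it deliberately sidesteps reclamation — pop frees the node immediately, which is exactly the pitfall epoch/hazard-pointer schemes solve — and it leaks any remaining nodes on drop:

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

struct Node<T> {
    value: T,
    next: *mut Node<T>,
}

pub struct Stack<T> {
    head: AtomicPtr<Node<T>>,
}

impl<T> Stack<T> {
    pub fn new() -> Self {
        Stack { head: AtomicPtr::new(ptr::null_mut()) }
    }

    pub fn push(&self, value: T) {
        let node = Box::into_raw(Box::new(Node { value, next: ptr::null_mut() }));
        loop {
            let head = self.head.load(Ordering::Acquire);
            unsafe { (*node).next = head; }
            // CAS: install `node` only if head hasn't changed since we read it
            if self.head
                .compare_exchange(head, node, Ordering::Release, Ordering::Acquire)
                .is_ok()
            {
                return;
            }
        }
    }

    pub fn pop(&self) -> Option<T> {
        loop {
            let head = self.head.load(Ordering::Acquire);
            if head.is_null() {
                return None;
            }
            let next = unsafe { (*head).next };
            if self.head
                .compare_exchange(head, next, Ordering::AcqRel, Ordering::Acquire)
                .is_ok()
            {
                // PITFALL: freeing here is only sound without concurrent pops —
                // another thread may still be reading `head`. Real code uses
                // crossbeam-epoch or hazard pointers before reclaiming.
                let node = unsafe { Box::from_raw(head) };
                return Some(node.value);
            }
        }
    }
}

fn main() {
    let s = Stack::new();
    s.push(1);
    s.push(2);
    assert_eq!(s.pop(), Some(2));
    assert_eq!(s.pop(), Some(1));
    assert_eq!(s.pop(), None);
}
```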
Q26: What is variance in Rust and why does it matter for unsafe code?
A: Variance determines how subtyping (lifetime relationships) propagates through type constructors. If 'long: 'short: covariant containers allow substituting longer lifetimes (safe for read-only: &'a T, Vec<T>); invariant containers forbid substitution (needed for read-write: &'a mut T, Cell<T>); contravariant reverses the relationship (function arguments). In unsafe code, incorrect variance in a wrapper type can lead to soundness holes. Classic bug: wrapping *mut T in a struct with PhantomData<T> (covariant) instead of PhantomData<&'a mut T> (invariant) when the struct logically has exclusive access to T.
Q27: Explain the Entry API pattern in HashMap. Why is it better than get-then-insert?
use std::collections::HashMap;
let mut scores: HashMap<String, Vec<i32>> = HashMap::new();
// BAD: double lookup + borrow conflict
// if !scores.contains_key("alice") {
// scores.insert("alice".into(), vec![]);
// }
// scores.get_mut("alice").unwrap().push(100);
// GOOD: single lookup via Entry API
scores.entry("alice".into())
.or_insert_with(Vec::new)
.push(100);
// Entry variants:
// .or_insert(default) — insert if empty
// .or_insert_with(|| val) — lazy init
// .or_default() — uses Default::default()
// .and_modify(|v| ...) — modify if exists
// .key() — inspect the key
// Why it's better:
// 1. Single hash computation + single probe
// 2. Avoids borrow checker issues (get borrows immutably, insert needs &mut)
// 3. More concise and idiomatic
Q28: Design a plugin system in Rust. How do you handle dynamic loading vs static dispatch?
A: Two main approaches. Static: define a Plugin trait, compile plugins into the binary. Use inventory or linkme crates for auto-registration. Pro: inlining, no unsafe. Con: requires recompilation. Dynamic: use libloading to load .so/.dll at runtime. Export a C-ABI compatible struct (repr(C)) and a creation function. Pro: hot-reload, third-party plugins. Con: unsafe, ABI fragility, no Rust trait objects across FFI (vtable layout is unstable). Hybrid: define a C-ABI plugin interface with stable types (#[repr(C)] structs, function pointers) and wrap it in a safe Rust trait internally. The abi_stable crate provides tools for stable Rust-to-Rust dynamic linking.
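A sketch of the static approach with a hand-built registry — trait and type names are illustrative; in practice inventory or linkme would replace the manual Vec with auto-registration:

```rust
// Plugin contract every implementation must satisfy.
trait Plugin {
    fn name(&self) -> &str;
    fn run(&self, input: &str) -> String;
}

struct Uppercase;
impl Plugin for Uppercase {
    fn name(&self) -> &str { "uppercase" }
    fn run(&self, input: &str) -> String { input.to_uppercase() }
}

struct Reverse;
impl Plugin for Reverse {
    fn name(&self) -> &str { "reverse" }
    fn run(&self, input: &str) -> String { input.chars().rev().collect() }
}

// Trait objects allow heterogeneous plugins behind one interface.
struct Registry {
    plugins: Vec<Box<dyn Plugin>>,
}

impl Registry {
    fn dispatch(&self, name: &str, input: &str) -> Option<String> {
        self.plugins
            .iter()
            .find(|p| p.name() == name)
            .map(|p| p.run(input))
    }
}

fn main() {
    let registry = Registry {
        plugins: vec![Box::new(Uppercase), Box::new(Reverse)],
    };
    assert_eq!(registry.dispatch("uppercase", "hi"), Some("HI".to_string()));
    assert_eq!(registry.dispatch("reverse", "abc"), Some("cba".to_string()));
    assert_eq!(registry.dispatch("missing", "x"), None);
}
```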
Q29: What is the difference between panic!, abort, and process::exit? When should you use each?
A: panic!: starts unwinding the stack, running destructors (Drop impls) for all local values. Can be caught with catch_unwind. With panic = "abort" in Cargo.toml, panics skip unwinding and abort immediately (smaller binary, faster). std::process::abort(): immediately terminates without running destructors or unwinding — use for unrecoverable corruption. std::process::exit(code): terminates with an exit code without running destructors, so flush any buffered output first; use it for clean CLI exits after shutdown work is done. Best practice: use Result for expected errors, panic! for programming bugs / invariant violations, abort for security-critical unrecoverable states, and exit for controlled shutdown.
Q30: How do you debug lifetime errors? Walk through your mental model.
A: Step 1: Read the error — Rust's messages now show the conflicting lifetimes and where they originate. Step 2: Draw the lifetime scopes — which value owns the data, which references point to it, when each goes out of scope. Step 3: Check the three elision rules: (a) each param gets its own lifetime, (b) if one input lifetime, it's assigned to output, (c) if &self/&mut self is a param, its lifetime is assigned to output. Step 4: If elision doesn't cover your case, add explicit lifetimes. Name them meaningfully: 'input, 'db, 'query. Step 5: If fighting the borrow checker, consider restructuring: clone (cheap for small data), split borrows (borrow different struct fields), use indices instead of references, or introduce a new scope with a block { }. Step 6: cargo clippy often suggests simpler alternatives. Step 7: As last resort, unsafe with a clear safety comment — but usually there's a safe design.
Q31: Implement a basic arena allocator in Rust. Explain why it's useful.
use std::cell::Cell;
struct Arena {
    buf: *mut u8, // raw pointer: avoids manufacturing aliasing &mut Vec references
    cap: usize,
    offset: Cell<usize>,
}
impl Arena {
    fn new(capacity: usize) -> Self {
        let mut v = vec![0u8; capacity];
        let buf = v.as_mut_ptr();
        std::mem::forget(v); // Arena now owns the allocation
        Arena { buf, cap: capacity, offset: Cell::new(0) }
    }
    fn alloc<T>(&self, val: T) -> &mut T {
        let align = std::mem::align_of::<T>();
        let size = std::mem::size_of::<T>();
        let mut off = self.offset.get();
        // Align the offset
        off = (off + align - 1) & !(align - 1);
        assert!(off + size <= self.cap, "arena out of memory");
        // SAFETY: in bounds, properly aligned, and never handed out twice
        // (the bump offset only grows)
        let ptr = unsafe { self.buf.add(off) as *mut T };
        unsafe { ptr.write(val); }
        self.offset.set(off + size);
        unsafe { &mut *ptr }
    }
}
impl Drop for Arena {
    fn drop(&mut self) {
        // SAFETY: reconstitute the Vec so the buffer itself is freed
        unsafe { drop(Vec::from_raw_parts(self.buf, 0, self.cap)); }
    }
}
// Usage: all allocations freed at once when Arena drops
let arena = Arena::new(4096);
let x = arena.alloc(42i32);
let s = arena.alloc(String::from("hello"));
// No individual frees — the whole buffer is released when Arena drops.
// Caveat: Drop impls of allocated values never run, so the String's own
// heap buffer above is leaked (bumpalo behaves the same by default).
// Why arenas? Batch deallocation (compilers, ECS, game frames),
// cache-friendly (contiguous memory), faster than individual heap allocs,
// natural fit for phase-based workloads. Production crate: bumpalo.
Q32: What are the tradeoffs between async/await and OS threads for concurrency?
A:
| Dimension | Async/Await | OS Threads |
|---|---|---|
| Memory per task | ~few hundred bytes (state machine) | ~2-8 MB (stack) |
| Context switch | ~10-50ns (cooperative, in userspace) | ~1-10µs (kernel, TLB flush) |
| Scalability | Millions of tasks on few threads | Thousands of threads max |
| CPU-bound work | Poor (blocks the runtime) | Good (preemptive scheduling) |
| Debugging | Harder (state machines, Pin) | Easier (stack traces) |
| Ecosystem | tokio, smol (async-std is deprecated; runtimes not interchangeable) | Standard library only |
| Cancellation | Drop the future (cooperative) | No safe cancellation (must signal) |
| Best for | I/O-bound (servers, proxies, crawlers) | CPU-bound or simple parallelism |
Q33: What is Deref coercion and how can it be both powerful and confusing?
A: When a type implements Deref<Target = U>, Rust will automatically convert &T to &U where needed. This is why &String works where &str is expected, and &Vec<T> works where &[T] is expected. The chain can be multi-step: &Box<String> → &String → &str. Power: makes smart pointers ergonomic, enables the "smart pointer" pattern. Confusion: method resolution can be surprising (calling .len() on &Box<Vec<u8>> auto-derefs through two levels), and implementing Deref for non-smart-pointer types is an antipattern (it breaks expectations about method resolution).
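A minimal demonstration of the coercion chain and auto-deref method resolution described above:

```rust
fn take_str(s: &str) -> usize { s.len() }
fn take_slice(s: &[u8]) -> usize { s.len() }

fn main() {
    let boxed: Box<String> = Box::new(String::from("hello"));
    // &Box<String> → &String → &str (two coercion steps)
    assert_eq!(take_str(&boxed), 5);

    let v: Vec<u8> = vec![1, 2, 3];
    // &Vec<u8> → &[u8]
    assert_eq!(take_slice(&v), 3);

    // Method resolution auto-derefs too: .len() reaches Vec through Box
    let bv: Box<Vec<u8>> = Box::new(vec![9; 4]);
    assert_eq!(bv.len(), 4);
}
```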
24. Quick Revision — The Rust Book
The Three Ownership Rules
1. Each value in Rust has an owner. 2. There can only be one owner at a time. 3. When the owner goes out of scope, the value is dropped.
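The rules can be made visible with a type that implements Drop (a small sketch):

```rust
// Rule 3 in action: Drop runs exactly when the owner leaves scope
struct Noisy(&'static str);
impl Drop for Noisy {
    fn drop(&mut self) { println!("dropping {}", self.0); }
}

// Passing by value transfers the single ownership (rules 1 and 2)
fn consume(s: String) -> usize {
    s.len()
} // s (the owner) goes out of scope here; the String's heap buffer is freed

fn main() {
    let _outer = Noisy("outer");
    {
        let _inner = Noisy("inner");
    } // prints "dropping inner"
    let s = String::from("hello");
    let n = consume(s); // ownership moves into consume
    // println!("{s}");  // ERROR: value moved
    assert_eq!(n, 5);
} // prints "dropping outer"
```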
Stack vs Heap — When It Matters
| Aspect | Stack | Heap |
|---|---|---|
| Size | Known at compile time, fixed | Dynamic, determined at runtime |
| Speed | Fast (push/pop, LIFO) | Slower (allocator searches for space) |
| Access | Direct | Follows a pointer |
| Examples | i32, f64, bool, char, fixed arrays, tuples of Copy types | String, Vec<T>, Box<T>, HashMap |
Move vs Copy vs Clone
let s1 = String::from("hello");
let s2 = s1; // MOVE — s1 is now invalid
// println!("{s1}"); // ERROR: value moved
let s3 = s2.clone(); // CLONE — deep copy, both valid
println!("{s2} {s3}"); // OK
let x = 42;
let y = x; // COPY — x is still valid (i32 implements Copy)
println!("{x} {y}"); // OK
Copy types: all integer and float primitives (i32, u64, f32/f64, ...), bool, char, shared references &T, and tuples/arrays composed only of Copy types. A type CANNOT implement Copy if it (or any of its fields) implements Drop.
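A short sketch of the derive rules: Copy can be derived only when every field is itself Copy:

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
struct Point { x: i32, y: i32 } // all fields are Copy → deriving Copy is fine

// #[derive(Copy, Clone)]        // ERROR: String is not Copy
// struct Named { name: String }

fn main() {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1;        // a copy, not a move
    assert_eq!(p1, p2); // p1 is still valid
}
```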
The Two Borrowing Rules
At any given time, you can have EITHER: • ONE mutable reference (&mut T) • ANY number of immutable references (&T) — but NEVER both simultaneously. All references must always be valid (no dangling).
Borrowing Gotchas & NLL
let mut s = String::from("hello");
let r1 = &s; // OK — immutable borrow
let r2 = &s; // OK — multiple immutable borrows
println!("{r1} {r2}");
// r1 and r2 are no longer used after this point (NLL)
let r3 = &mut s; // OK — NLL: immutable borrows ended at last use
r3.push_str(" world");
// Dangling reference prevention:
// fn dangle() -> &String {
// let s = String::from("hello");
// &s // ERROR: s dropped at end of function
// }
// Fix: return the owned String instead
Lifetime Elision Rules
The compiler applies these rules IN ORDER to infer lifetimes:
Rule 1: Each reference parameter gets its own lifetime.
fn f(a: &str, b: &str) → fn f<'a, 'b>(a: &'a str, b: &'b str)
Rule 2: If there is exactly ONE input lifetime, it's assigned to ALL outputs.
fn f(s: &str) -> &str → fn f<'a>(s: &'a str) -> &'a str
Rule 3: If &self or &mut self is a parameter, self's lifetime → all outputs.
fn method(&self, s: &str) -> &str → output gets self's lifetime
If the rules don't resolve all lifetimes → you must annotate explicitly.
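When the rules fail (two reference inputs, one reference output), explicit annotation is required. The classic example:

```rust
// Rule 1 gives x and y separate lifetimes; Rule 2 doesn't apply (two inputs);
// Rule 3 doesn't apply (no self) — so we must tie the output to both inputs
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let s1 = String::from("long string");
    {
        let s2 = String::from("short");
        let result = longest(s1.as_str(), s2.as_str());
        assert_eq!(result, "long string");
    } // s2 dropped here — result must not outlive the shorter borrow
}
```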
Structs with Lifetimes
// A struct holding a reference MUST declare the lifetime
struct ImportantExcerpt<'a> {
part: &'a str, // part must live at least as long as 'a
}
let novel = String::from("Call me Ishmael. Some years ago...");
let first = novel.split('.').next().unwrap();
let excerpt = ImportantExcerpt { part: first };
// excerpt cannot outlive novel (which owns the str data)
Traits — Core Patterns
// Define a trait
trait Summary {
fn summarize(&self) -> String;
// Default implementation (can be overridden)
fn preview(&self) -> String {
format!("Read more: {}", self.summarize())
}
}
// Implement on a type
impl Summary for Article {
fn summarize(&self) -> String { self.title.clone() }
}
// Trait bounds — three equivalent syntaxes:
fn notify(item: &impl Summary) { } // sugar
fn notify<T: Summary>(item: &T) { } // bound
fn notify<T>(item: &T) where T: Summary { } // where clause
// Multiple bounds
fn display_summary(item: &(impl Summary + Display)) { }
// Return impl Trait (one concrete type only)
fn make_summary() -> impl Summary { Article { /* ... */ } }
// Blanket implementation
impl<T: Display> ToString for T { /* std does this */ }
Coherence & Orphan Rule
You can impl a trait for a type ONLY IF:
• The trait is local to your crate, OR
• The type is local to your crate.
// OK: local trait + foreign type
trait MyTrait {}
impl MyTrait for Vec<i32> {}
// OK: foreign trait + local type
struct MyType;
impl Display for MyType { /* ... */ }
// ERROR: foreign trait + foreign type
// impl Display for Vec<i32> {}
// Workaround: newtype pattern
struct Wrapper(Vec<i32>);
impl Display for Wrapper { /* ... */ }
Pattern Matching — Complete Rules
match value {
Pattern => expression, // literal, struct, enum, tuple
Pattern if guard => expr, // match guard
x @ 1..=5 => use(x), // bind + range
(a, b, ..) => use(a, b), // ignore rest with ..
Some(ref x) => borrow(x), // borrow instead of move
_ => default, // catch-all (must be last)
}
// Exhaustiveness: ALL variants must be handled. Compiler enforces this.
// Non-exhaustive: _ or named catch-all handles remaining cases.
// if let — when you only care about one variant:
if let Some(val) = option { use(val); }
// while let — loop while pattern matches:
while let Some(top) = stack.pop() { process(top); }
// let-else (stable since 1.65):
let Some(val) = option else { return Err("missing"); };
Error Handling Decision Tree
Can it fail?
/ \
No Yes
| |
just return Is it recoverable?
/ \
No Yes
| |
panic!() Result<T, E>
.unwrap() |
.expect() Propagate with ?
or handle with match
// The ? operator — early return on Err:
fn read_file() -> Result<String, io::Error> {
let mut s = String::new();
File::open("data.txt")?.read_to_string(&mut s)?;
Ok(s)
}
// ? on Option — early return on None:
fn first_char(text: &str) -> Option<char> {
text.lines().next()?.chars().next()
}
// ? in main():
fn main() -> Result<(), Box<dyn Error>> {
let f = File::open("config.toml")?;
Ok(())
}
Smart Pointer Decision Tree
| Need | Use | Notes |
|---|---|---|
| Heap allocation, single owner | Box<T> | Compile-time checked, zero overhead over raw heap |
| Multiple owners, immutable | Rc<T> | Reference counted, single-threaded only |
| Multiple owners + mutation | Rc<RefCell<T>> | Runtime borrow check, panics on violation |
| Multiple owners, threaded | Arc<T> | Atomic ref count, thread-safe |
| Multiple owners + mutation, threaded | Arc<Mutex<T>> | Locks for thread-safe interior mutability |
| Interior mutability (Copy types) | Cell<T> | Get/set, no borrow tracking overhead |
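A small sketch of the Rc<RefCell<T>> row: shared ownership combined with runtime-checked mutation:

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    let shared = Rc::new(RefCell::new(vec![1, 2, 3]));
    let other = Rc::clone(&shared);       // second owner (refcount = 2)
    other.borrow_mut().push(4);           // runtime-checked mutable borrow
    assert_eq!(shared.borrow().len(), 4); // both owners see the change
    assert_eq!(Rc::strong_count(&shared), 2);
    // Overlapping borrow_mut() calls would panic at runtime:
    // let a = shared.borrow_mut();
    // let b = shared.borrow_mut(); // panic: already mutably borrowed
}
```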
Closures — Capture & Trait Hierarchy
// Capture modes (compiler picks the least restrictive):
let x = String::from("hi");
let borrow_closure = || println!("{x}"); // borrows &x → Fn
let mut v = vec![1,2];
let mut_closure = || v.push(3); // borrows &mut v → FnMut
let consume_closure = move || drop(x); // moves x → FnOnce
// Trait hierarchy: Fn ⊂ FnMut ⊂ FnOnce
// Every Fn also implements FnMut and FnOnce.
// Choose the most permissive bound your API needs:
// FnOnce — accepts all closures (most flexible for caller)
// Fn — callable many times through &self (most restrictive)
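Sketching the bound-choice advice with two hypothetical APIs: take FnOnce when you call once (accepts every closure), and Fn only when you must call repeatedly through a shared borrow:

```rust
// FnOnce: may consume its captures; callable exactly once
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

// Fn: called twice through &self, so move-consuming closures are rejected
fn call_twice<F: Fn() -> i32>(f: F) -> i32 {
    f() + f()
}

fn main() {
    let owned = String::from("hi");
    // This closure moves `owned` out when called → it is only FnOnce
    assert_eq!(call_once(move || owned), "hi");

    let n = 21;
    // Captures n by shared reference → Fn (and therefore FnMut + FnOnce)
    assert_eq!(call_twice(|| n), 42);
    // call_twice(move || owned) would not compile: the closure is FnOnce only
}
```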
Concurrency — Thread Safety Quick Ref
// Spawn a thread
let handle = thread::spawn(move || {
// `move` forces capture by value (required for thread safety)
expensive_work()
});
let result = handle.join().unwrap();
// Mutex — mutual exclusion
let data = Arc::new(Mutex::new(0));
let data_clone = Arc::clone(&data);
thread::spawn(move || {
let mut num = data_clone.lock().unwrap();
*num += 1;
// MutexGuard dropped here → lock released automatically
});
// Channel — message passing
let (tx, rx) = mpsc::channel();
thread::spawn(move || { tx.send(42).unwrap(); });
let val = rx.recv().unwrap(); // blocks until message arrives
Send & Sync — Thread Safety Markers
| Trait | Meaning | Implements | Does NOT |
|---|---|---|---|
| Send | Ownership transferable across threads | Most types, Arc<T>, Mutex<T> | Rc<T>, raw pointers |
| Sync | &T is safe to share across threads | i32, Arc<T>, Mutex<T> | Cell<T>, RefCell<T>, Rc<T> |
// Rule: T is Sync if and only if &T is Send.
// Both are auto traits — the compiler derives them from field types.
// Manual `unsafe impl` only when wrapping raw pointers with proper synchronization.
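A quick demonstration: Arc crosses threads, while the equivalent Rc version is rejected at compile time:

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

fn main() {
    // Arc<Vec<i32>> is Send + Sync, so a clone can move into another thread
    let shared = Arc::new(vec![1, 2, 3]);
    let handle = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || shared.iter().sum::<i32>())
    };
    assert_eq!(handle.join().unwrap(), 6);

    // Rc uses non-atomic reference counting → not Send
    let rc = Rc::new(5);
    // thread::spawn(move || *rc); // ERROR: Rc<i32> cannot be sent between threads
    let _ = rc;
}
```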
Async / Await — Mental Model
// async fn returns a Future (lazy — does nothing until polled)
async fn fetch_data(url: &str) -> Result<String, Error> {
let response = reqwest::get(url).await?; // yield point
let body = response.text().await?; // yield point
Ok(body)
}
// .await suspends the current task, yields to the runtime.
// The runtime polls the Future again when I/O is ready.
// Futures are state machines — each .await = a state transition.
// Key distinction:
// Concurrency ≠ Parallelism
// Concurrent: one thread interleaves multiple tasks (async)
// Parallel: multiple threads execute simultaneously (rayon, threads)
// CPU-bound work in async context:
// BAD: heavy_compute() inside async task → blocks the runtime thread
// GOOD: tokio::task::spawn_blocking(|| heavy_compute())
Unsafe Rust — The Five Superpowers
unsafe {
// 1. Dereference a raw pointer
let ptr = &val as *const i32;
let val = *ptr;
// 2. Call an unsafe function
dangerous_function();
// 3. Access or modify a mutable static variable
COUNTER += 1;
// 4. Implement an unsafe trait
// unsafe impl Send for MyType {}
// 5. Access fields of a union
// let f = my_union.field1;
}
// unsafe does NOT turn off the borrow checker!
// It only unlocks these 5 capabilities.
// Minimize unsafe scope. Document safety invariants.
// Validate with: cargo +nightly miri test
Advanced Traits — Quick Reference
// Associated types (one concrete type per impl, unlike generics)
trait Iterator {
type Item; // associated type
fn next(&mut self) -> Option<Self::Item>;
}
// Default type parameters
trait Add<Rhs = Self> { // Rhs defaults to Self
type Output;
fn add(self, rhs: Rhs) -> Self::Output;
}
// Operator overloading
impl Add for Point {
type Output = Point;
fn add(self, other: Point) -> Point {
Point { x: self.x + other.x, y: self.y + other.y }
}
}
// Fully qualified syntax (disambiguation)
<Type as Trait>::method(&instance);
// Supertraits (trait requires another trait)
trait PrettyPrint: Display {
fn pretty(&self) -> String;
}
// Any type implementing PrettyPrint must also implement Display.
Common Conversions Cheat Sheet
| From | To | Method |
|---|---|---|
| &str | String | .to_string(), String::from(s), .to_owned() |
| String | &str | &s, s.as_str() |
| &str | i32 | s.parse::<i32>()? |
| i32 | String | n.to_string() |
| Vec<T> | &[T] | &v, v.as_slice() |
| &[T] | Vec<T> | s.to_vec() |
| Option<T> | Result<T,E> | opt.ok_or(err) |
| Result<T,E> | Option<T> | res.ok() |
| Box<dyn Error> | concrete | err.downcast::<MyError>() |
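The table's rows, exercised end to end in one snippet:

```rust
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let s: String = "42".to_string();      // &str → String
    let slice: &str = s.as_str();          // String → &str
    let n: i32 = slice.parse::<i32>()?;    // &str → i32 (fallible)
    let back: String = n.to_string();      // i32 → String
    assert_eq!(back, "42");

    let v: Vec<i32> = vec![1, 2, 3];
    let sl: &[i32] = v.as_slice();         // Vec<T> → &[T]
    let v2: Vec<i32> = sl.to_vec();        // &[T] → Vec<T>
    assert_eq!(v, v2);

    let opt: Option<i32> = Some(7);
    let res: Result<i32, &str> = opt.ok_or("missing"); // Option → Result
    assert_eq!(res.ok(), Some(7));                     // Result → Option
    Ok(())
}
```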
Iterator Essentials
// Adapters (lazy — produce new iterators):
.map(|x| x * 2)  .filter(|x| x > &5)  .enumerate()  .zip(other)  .chain(other)
.take(n)  .skip(n)  .flatten()  .peekable()  .inspect(|x| dbg!(x))
// Consumers (eager — drive the iterator):
.collect::<Vec<_>>()  .sum::<i32>()  .count()  .any(|x| cond)  .all(|x| cond)
.find(|x| cond)  .fold(init, |acc, x| acc + x)  .for_each(|x| use(x))
// Performance: iterators compile to the same code as hand-written loops
// (zero-cost abstraction via monomorphization).
String Types Decision Guide
| Type | Owned? | Encoding | Use For |
|---|---|---|---|
| String | Yes | UTF-8 | General text manipulation |
| &str | No (slice) | UTF-8 | Function parameters, string literals |
| OsString / &OsStr | Yes / No | OS-native | File names, env vars (may not be valid UTF-8) |
| CString / &CStr | Yes / No | Null-terminated | FFI with C libraries |
| PathBuf / &Path | Yes / No | OS-native | Filesystem paths (wraps OsStr) |
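A short example of the OS-native rows: PathBuf/&Path wrap OsStr, so conversion to &str is fallible because a path may not be valid UTF-8:

```rust
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

fn main() {
    let mut path = PathBuf::from("/tmp/data"); // owned, growable (like String)
    path.push("report.csv");
    let borrowed: &Path = path.as_path();      // borrowed view (like &str)

    assert_eq!(borrowed.extension(), Some(OsStr::new("csv")));
    assert_eq!(borrowed.file_stem(), Some(OsStr::new("report")));

    // Fallible conversion: None if the path isn't valid UTF-8
    let as_utf8: Option<&str> = borrowed.to_str();
    assert!(as_utf8.is_some());
}
```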
Cargo & Tooling Essentials
cargo new my_project     # create binary project
cargo new --lib my_lib   # create library project
cargo build --release    # optimized build
cargo test               # run all tests
cargo test test_name     # run specific test
cargo clippy             # lint for common mistakes
cargo fmt                # format code
cargo doc --open         # generate & open docs
cargo bench              # run benchmarks
cargo +nightly miri test # detect undefined behavior
cargo tree               # dependency graph
cargo update             # update deps within semver
cargo audit              # check for security advisories
Testing Patterns
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn basic_test() {
assert_eq!(add(2, 3), 5);
}
#[test]
fn with_message() {
assert!(result.is_ok(), "Expected Ok, got: {:?}", result);
}
#[test]
#[should_panic(expected = "out of bounds")]
fn test_panic() {
let v = vec![1, 2, 3];
v[99];
}
#[test]
fn result_test() -> Result<(), String> {
let val = parse("42").map_err(|e| e.to_string())?;
assert_eq!(val, 42);
Ok(())
}
#[test]
#[ignore] // skipped unless: cargo test -- --ignored
fn expensive_test() { /* ... */ }
}
// Integration tests: tests/ directory (separate compilation unit)
// Doc tests: code in /// comments is compiled and tested
25. Trait Objects & Dynamic Dispatch
Fat Pointers & VTable Layout
A trait object (&dyn Trait) is a fat pointer: 16 bytes on 64-bit systems (2 × 8-byte pointers).
// Structure of &dyn Trait:
// [data_ptr: *const T] [vtable_ptr: *const VTable]
//
// VTable contains function pointers for all trait methods:
// struct VTable {
// drop_fn: unsafe fn(*mut T),
// size: usize,
// align: usize,
// method_1: fn(*const T, ...) -> ReturnType,
// method_2: fn(*const T, ...) -> ReturnType,
// ...
// }
trait Animal {
fn speak(&self) -> String;
fn move_around(&mut self);
}
struct Dog;
impl Animal for Dog {
fn speak(&self) -> String { "Woof!".to_string() }
fn move_around(&mut self) {}
}
struct Cat;
impl Animal for Cat {
fn speak(&self) -> String { "Meow!".to_string() }
fn move_around(&mut self) {}
}
// Creating trait objects
let dog: Box<dyn Animal> = Box::new(Dog);
let cat: Box<dyn Animal> = Box::new(Cat);
// Both point to different data with different vtables
let animals: Vec<Box<dyn Animal>> = vec![
Box::new(Dog),
Box::new(Cat),
];
// Each Box contains: [heap_ptr to Dog/Cat] [vtable_ptr]
// When you call dog.speak(), the vtable is dereferenced to find the right function
&dyn Trait vs Box<dyn Trait>
// &dyn Trait — borrowed trait object (fat reference)
// Stores: [data_ptr: &T] [vtable_ptr: *const VTable]
fn print_animal(animal: &dyn Animal) {
println!("{}", animal.speak());
}
// Box<dyn Trait> — owned trait object
// Stores: [data_ptr: *mut T] [vtable_ptr: *const VTable]
// Owns the data; drops it when Box is dropped
fn process(mut animal: Box<dyn Animal>) {
animal.move_around();
// animal is dropped here
}
// Other pointers to trait objects:
// Arc<dyn Trait> — thread-safe shared ownership
let shared: Arc<dyn Animal> = Arc::new(Dog);
// Rc<dyn Trait> — single-threaded shared ownership
let rc: Rc<dyn Animal> = Rc::new(Cat);
// Pattern: store trait objects in Vec
let mut menagerie: Vec<Box<dyn Animal>> = vec![];
menagerie.push(Box::new(Dog));
menagerie.push(Box::new(Cat));
for animal in &menagerie {
println!("{}", animal.speak());
}
// Two different speak() implementations called via vtable dispatch
Object Safety Rules
Only traits with object-safe methods can become trait objects. A method is object-safe if:
| Requirement | Reason | Example |
|---|---|---|
| No Self type (except &self, &mut self) | Vtable doesn't know the concrete type | fn clone(&self) -> Self is NOT safe |
| No generic type parameters | VTable must be monomorphic | fn foo<T>(&self, x: T) is NOT safe |
| No const generics | VTable is fixed-size | fn bar<const N: usize>(&self) is NOT safe |
| No Self in return type | Caller expects concrete type, not trait object | fn get_self(&self) -> Self is NOT safe |
| Supertraits must be object-safe | VTable must implement all supertrait methods | If TraitA: TraitB, both must be object-safe |
// NOT object-safe — Self in return type
trait Cloneable {
fn clone(&self) -> Self; // ERROR: can't be trait object
}
// SAFE — returns concrete type
trait Drawable {
fn draw(&self); // OK
}
impl Drawable for Circle { fn draw(&self) {} }
let shape: Box<dyn Drawable> = Box::new(Circle);
// NOT object-safe — generic type parameter
trait Processor {
fn process<T>(&self, input: T); // ERROR
}
// SAFE — use associated type instead
trait Processor2 {
type Input;
fn process(&self, input: Self::Input);
}
// NOT object-safe — const generic
trait Batched {
fn batch<const N: usize>(&self, items: [u8; N]); // ERROR
}
// Checking object safety at compile time
const _: () = {
fn assert_object_safe(_: &dyn Drawable) {}
};
Static vs Dynamic Dispatch Performance
| Aspect | Static Dispatch (impl Trait) | Dynamic Dispatch (dyn Trait) |
|---|---|---|
| Code generation | Monomorphized (one version per type) | Single code; vtable lookup at runtime |
| Call overhead | ~0 cycles (inlined, branch-predicted) | ~1-3 cycles (vtable indirection + cache miss) |
| Binary size | Larger (code duplication for each type) | Smaller |
| Optimization | Aggressive inlining, devirtualization | Limited; vtable prevents inlining |
| Cache locality | Good (code specialized for type) | Good (vtable typically cached) |
| Heterogeneous collections | Must use trait objects anyway | Natural; one Vec<Box<dyn T>> |
// Static dispatch — compiler generates specialized code
fn process_static<T: Animal>(animal: &T) {
println!("{}", animal.speak());
}
// Compiler generates:
// process_static<Dog>, process_static<Cat>, etc.
// Dynamic dispatch — single code path, runtime lookup
fn process_dynamic(animal: &dyn Animal) {
println!("{}", animal.speak());
}
// One function; vtable lookup for the actual speak() implementation
// Benchmark: static vs dynamic
use std::time::Instant;
fn bench_static() {
    // Static dispatch needs a homogeneous collection (one concrete type)
    let dogs = vec![Dog, Dog];
    let start = Instant::now();
    for _ in 0..1_000_000 {
        for dog in &dogs {
            let _ = dog.speak(); // direct call: monomorphized, inlinable
        }
    }
    println!("Static: {:?}", start.elapsed());
}
fn bench_dynamic() {
let animals: Vec<Box<dyn Animal>> = vec![
Box::new(Dog),
Box::new(Cat),
];
let start = Instant::now();
for _ in 0..1_000_000 {
for animal in &animals {
let _ = animal.speak(); // vtable dispatch
}
}
println!("Dynamic: {:?}", start.elapsed());
}
// Static typically 2-5x faster in tight loops
// Dynamic often fast enough if not in innermost loop
When to Use Each
impl Trait (static dispatch) when:
- You know all concrete types at compile time
- The function is in a hot path or called millions of times
- You want to avoid heap allocation
- The trait has methods that aren't object-safe (like clone(), which returns Self)
dyn Trait (dynamic dispatch) when:
- You have heterogeneous collections (Vec of different types)
- Dispatch overhead is negligible compared to work done per call
- You're building a plugin system or interpreter
- You need to return different concrete types from branches
- You want smaller binary size
impl Trait vs dyn Trait Decision Tree
// Decision tree examples:
// 1. Return different types from branches
fn factory(kind: &str) -> Box<dyn Animal> {
match kind {
"dog" => Box::new(Dog),
"cat" => Box::new(Cat),
_ => Box::new(Dog),
}
}
// 2. Store multiple types in one collection
fn collect_animals(animals: Vec<Box<dyn Animal>>) {
for animal in animals {
println!("{}", animal.speak());
}
}
// 3. Use generic if you need static dispatch
fn single_animal<T: Animal>(animal: &T) {
println!("{}", animal.speak());
}
// Compiler generates monomorphic versions
Downcasting with Any
use std::any::Any;
trait Animal: Any {
fn speak(&self) -> String;
}
struct Dog {
breed: String,
}
impl Animal for Dog {
fn speak(&self) -> String { "Woof!".to_string() }
}
impl Dog {
fn fetch(&self) {
println!("Fetching as a {}", self.breed);
}
}
fn interact(animal: &dyn Animal) {
println!("{}", animal.speak());
// Downcast via Any (upcasting &dyn Animal → &dyn Any needs Rust 1.86+;
// older code adds an `fn as_any(&self) -> &dyn Any` method to the trait)
if let Some(dog) = (animal as &dyn Any).downcast_ref::<Dog>() {
dog.fetch();
}
}
// Usage
let dog = Dog { breed: "Labrador".to_string() };
interact(&dog);
// Downcasting Box<dyn Trait>
fn extract_dog(animal: Box<dyn Animal>) -> Option<Dog> {
(animal as Box<dyn Any>)
.downcast::<Dog>()
.ok()
.map(|boxed| *boxed)
}
// Mutable downcasting
fn train(animal: &mut dyn Animal) {
    if let Some(dog) = (animal as &mut dyn Any).downcast_mut::<Dog>() {
        dog.breed.push_str(" (trained)"); // mutate through the concrete type
    }
}
26. Pin, Unpin & Self-Referential Types
Why Pin Exists
Self-referential structs and async/await state machines require values whose addresses stay stable. Pin guarantees that a value will never be moved in memory after it has been pinned.
// Problem: Self-referential struct (NOT possible without Pin/unsafe)
struct SelfRef {
value: String,
// ptr_to_value: *const String, // Would become dangling if moved!
}
// If you move this struct, the pointer becomes dangling.
// Solution: Async/await creates self-referential state machines
async fn example() {
let s = "hello".to_string();
some_future().await; // May suspend; s must stay in same memory
println!("{}", s);
}
// Generated state machine:
enum State {
Start,
WaitingForFuture {
s: String,
// Hidden: reference to `s` while suspended
},
}
// Pin guarantees:
// - The pinned value cannot be moved again (unless it is Unpin)
// - Self-references stay valid because the address is stable
//   (creating them still requires unsafe code)
// - Required for async state machines that hold references across .await
Pin<&T> Variants
| Type | Meaning | Usage |
|---|---|---|
| Pin<&T> | T is pinned & borrowed immutably | Read-only access; can't move T |
| Pin<&mut T> | T is pinned & borrowed mutably | Exclusive mutable access; can't move T |
| Pin<Box<T>> | T is pinned & heap-owned | Owned pinned value; stable address |
| Pin<Rc<T>> | T is pinned & reference-counted | Shared pinned value (single-threaded) |
| Pin<Arc<T>> | T is pinned & atomically reference-counted | Shared pinned value (thread-safe) |
use std::pin::Pin;
use std::marker::PhantomPinned;
// Basic Pin usage (i32 is Unpin, so this Pin imposes no real restriction —
// shown only for the API shape; Pin only "bites" for !Unpin types)
let mut value = 5;
let pinned = Pin::new(&mut value);
// Pin with Box (heap-allocated, stable address)
let boxed = Box::new(String::from("hello"));
let pinned_box: Pin<Box<String>> = Pin::new(boxed);
// String is on heap, will never move
// Self-referential struct with PhantomPinned
struct Node {
data: i32,
next: Option<Pin<Box<Node>>>,
// PhantomPinned prevents impl of Unpin
_pin: PhantomPinned,
}
// Can only create self-ref with manual Pin construction
let mut node = Box::pin(Node {
data: 42,
next: None,
_pin: PhantomPinned,
});
// Now node.next can safely hold a Pin to self once initialized
Unpin Marker Trait
Most types implement Unpin (marker trait, auto-derived). A type is Unpin if it's safe to move it after pinning. Only types with self-references or custom drop logic need !Unpin.
use std::marker::{PhantomPinned, Unpin};
// Most types are Unpin
struct Normal {
x: i32,
y: String,
}
// impl Unpin for Normal {} // Auto-derived; can be moved safely
// Types that prevent Unpin
struct NotUnpin {
value: String,
_pin: PhantomPinned, // Makes !Unpin
}
// NOT impl Unpin for NotUnpin
// Only Unpin types can be moved out of Pin
fn move_if_unpin<T: Unpin + Default>(pinned: Pin<&mut T>) -> T {
    // Pin::into_inner returns &mut T because T: Unpin;
    // mem::take moves the value out, leaving T::default() behind
    std::mem::take(Pin::into_inner(pinned))
}
// Non-Unpin types can't be moved, but can be mutated in place
fn mutate_pinned<T: ?Sized>(pinned: Pin<&mut T>) {
// Can access with Pin::get_unchecked_mut only if you promise safety
let r = unsafe { Pin::get_unchecked_mut(pinned) };
// Now you have &mut T but T hasn't moved!
}
// Custom impl !Unpin
struct CustomNotUnpin {
data: Vec<i32>,
}
impl !Unpin for CustomNotUnpin {} // nightly feature
pin_project Macro
use pin_project::pin_project;
use std::pin::Pin;
// Derives Pin projection for fields
#[pin_project]
struct MyStruct {
#[pin]
pinned_field: String,
regular_field: i32,
}
impl MyStruct {
fn do_something(self: Pin<&mut Self>) {
let this = self.project();
// this.pinned_field is Pin<&mut String>
let pinned_str: Pin<&mut String> = this.pinned_field;
// this.regular_field is &mut i32
let normal: &mut i32 = this.regular_field;
}
}
// Use in streams/futures
#[pin_project]
struct BufferedReader<R> {
#[pin]
reader: R,
buffer: Vec<u8>,
}
// pin_project generates safe projections automatically
// Prevents accidental moving of pinned fields
Manual Pin Implementation
use std::pin::Pin;
use std::marker::PhantomPinned;
use std::ops::{Deref, DerefMut};
// Manually implement Pin safety for a custom type
struct SelfRefStruct {
name: String,
name_ptr: *const String, // Points to self.name
_pin: PhantomPinned,
}
impl SelfRefStruct {
fn new(name: String) -> Pin<Box<Self>> {
let mut boxed = Box::new(SelfRefStruct {
name,
name_ptr: std::ptr::null(),
_pin: PhantomPinned,
});
let name_ptr = &boxed.name as *const String;
boxed.name_ptr = name_ptr; // plain field write — no unsafe needed yet
// Pin::new requires T: Unpin; Box::into_pin works for !Unpin types
Box::into_pin(boxed)
}
fn get_name(&self) -> &str {
// Safe: name_ptr is valid because struct is pinned
unsafe { &*self.name_ptr }
}
}
// Custom Drop for pinned types
impl Drop for SelfRefStruct {
fn drop(&mut self) {
// Safe to access self because Pin guarantees stability
let _ = self.get_name();
}
}
// Usage
fn main() {
let pinned = SelfRefStruct::new("hello".to_string());
println!("{}", pinned.get_name());
}
Structural Pinning Rules
use std::pin::Pin;
// Structural pinning: projection respects Pin semantics
struct Structural {
field: String,
}
impl Structural {
// Safe projection: field doesn't have internal references
fn project(self: Pin<&mut Self>) -> Pin<&mut String> {
unsafe { Pin::new_unchecked(&mut self.get_unchecked_mut().field) }
}
}
// Intrinsic pinning: type requires Pin regardless
struct Intrinsic {
_pin: std::marker::PhantomPinned,
}
// Non-structural (WRONG): projecting a self-referential field
struct BadStructural {
data: String,
ptr: *const String, // Points to data!
}
impl BadStructural {
fn project_bad(self: Pin<&mut Self>) -> Pin<&mut String> {
// WRONG: moving data would invalidate ptr!
// Never do this without deep understanding
unsafe { Pin::new_unchecked(&mut self.get_unchecked_mut().data) }
}
}
// Rule: only project fields that don't reference other struct fields
Pin in async/await Context
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
// Futures must be pinned before polling
async fn my_async() -> i32 {
42
}
fn use_future() {
let future = my_async();
let mut boxed = Box::pin(future);
// Can poll a pinned future
// In real code, the executor does this
}
// Manual Future impl requires Pin
struct MyFuture {
state: u32,
}
impl Future for MyFuture {
type Output = i32;
fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<i32> {
// self: Pin<&mut Self> guarantees stable memory
// Can create self-references safely during suspension
Poll::Ready(42)
}
}
// Async state machine created by async/await:
enum AsyncState {
Start,
Waiting {
variable: String,
// reference: &String, // Safe because state pinned
},
Done,
}
// Each .await point creates a state that can hold references
// Pin guarantees references remain valid across await
27. FFI (Foreign Function Interface)
Calling C from Rust
// Declaring C functions
extern "C" {
fn printf(format: *const u8, ...) -> i32;
fn strlen(s: *const u8) -> usize;
fn malloc(size: usize) -> *mut u8;
fn free(ptr: *mut u8);
}
// Safe wrapper
unsafe fn print_c_string(s: *const u8) {
printf(b"String: %s\n\0".as_ptr() as *const u8, s);
}
// Calling C function
fn main() {
unsafe {
let c_str = b"Hello from Rust\0";
print_c_string(c_str.as_ptr());
}
}
// Better: use std::ffi for C string conversion
use std::ffi::{CStr, CString};
fn safe_strlen(s: &CStr) -> usize {
unsafe { strlen(s.as_ptr() as *const u8) }
}
fn main() {
let rust_str = "hello";
let c_str = CString::new(rust_str).unwrap();
let len = safe_strlen(&c_str);
println!("Length: {}", len);
}
Calling Rust from C
// Rust code that C will call
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
a + b
}
#[no_mangle]
pub extern "C" fn process_string(s: *const std::os::raw::c_char) -> i32 {
    if s.is_null() {
        return -1;
    }
    // libc::strlen takes *const c_char (requires the `libc` crate)
    unsafe { libc::strlen(s) as i32 }
}
// In C:
// extern int add(int a, int b);
// int result = add(5, 3); // 8
//
// extern int process_string(const char *s);
// int len = process_string("hello"); // 5
// Build as a library
// Cargo.toml:
// [lib]
// crate-type = ["cdylib"] // C-compatible dynamic library
// or
// crate-type = ["staticlib"] // Static library
extern "C" & #[no_mangle]
| Concept | Purpose | Example |
|---|---|---|
| extern "C" | Declares function uses C calling convention | extern "C" fn foo() {} |
| #[no_mangle] | Prevents Rust name mangling; exposes raw symbol name | #[no_mangle] pub extern "C" fn foo() {} |
| Name mangling | Rust encodes function signature in symbol name | _ZN3foo3bar17h1234567890abcdefE |
| C name stability | C linker expects exact symbol names | C expects foo, not mangled name |
// Without #[no_mangle], Rust mangles the name
pub extern "C" fn process() {}
// Mangled: _ZN7example7processE (or similar)
// C can't find "process"
// With #[no_mangle], name is exposed as-is
#[no_mangle]
pub extern "C" fn process() {}
// Symbol: process
// C can now link and call it
// Multiple ABIs
extern "C" { fn c_func(); } // default C calling convention
extern "cdecl" { fn cdecl_func(); } // explicit C convention (32-bit x86)
extern "stdcall" { fn stdcall_func(); } // Win32 API convention
extern "fastcall" { fn fast_func(); } // registers-first convention (x86)
// Using #[no_mangle] for exported Rust functions
#[no_mangle]
pub extern "C" fn rust_malloc(size: usize) -> *mut u8 {
let vec: Vec<u8> = vec![0; size];
Box::into_raw(vec.into_boxed_slice()) as *mut u8
}
#[no_mangle]
pub extern "C" fn rust_free(ptr: *mut u8, size: usize) {
if !ptr.is_null() {
unsafe {
let _ = Vec::from_raw_parts(ptr, 0, size);
}
}
}
CString & CStr — C String Conversion
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
// CString: owned C string (null-terminated)
let rust_str = "Hello, World!";
let c_string = CString::new(rust_str).unwrap();
// Automatically adds null terminator
// Get pointer to pass to C
let ptr: *const c_char = c_string.as_ptr();
// CStr: borrowed C string (immutable)
// SAFETY contract: ptr must be non-null, null-terminated, and valid UTF-8;
// the caller-chosen lifetime 'a must not outlive the underlying buffer
// (returning &'static str here would be a lie for most C strings)
unsafe fn from_c_str<'a>(ptr: *const c_char) -> &'a str {
    let c_str = CStr::from_ptr(ptr);
    c_str.to_str().unwrap()
}
// Roundtrip conversion
fn main() {
let original = "rust string";
let c_string = CString::new(original).unwrap();
let c_str = CStr::from_bytes_with_nul(b"rust string\0").unwrap();
assert_eq!(original, c_str.to_str().unwrap());
}
// Handling embedded nulls
let with_null = "string\0with\0nulls";
match CString::new(with_null) {
Ok(_) => println!("No embedded nulls"),
Err(e) => println!("Contains null: {}", e),
}
// Safe wrapper for C function taking &CStr
extern "C" {
fn c_strlen(s: *const c_char) -> usize;
}
fn safe_strlen(s: &str) -> usize {
let c_str = CString::new(s).unwrap();
unsafe { c_strlen(c_str.as_ptr()) }
}
Raw Pointers in FFI
use std::ptr;
// Creating null pointers
let null: *const i32 = ptr::null();
let null_mut: *mut i32 = ptr::null_mut();
// Dereferencing raw pointers (unsafe)
let x = 42;
let ptr: *const i32 = &x;
unsafe {
println!("{}", *ptr); // 42
}
// Pointer arithmetic
fn buffer_access(ptr: *const i32, index: usize) -> i32 {
unsafe { *ptr.add(index) }
}
// Pointer casts
let ptr: *const u8 = b"hello".as_ptr();
let as_int = ptr as *const i32; // Reinterpret as i32
// Offset calculations
fn struct_field_offset() {
#[repr(C)]
struct MyStruct {
a: i32,
b: i64,
}
let s = MyStruct { a: 1, b: 2 }; // bind first — a raw ptr to a temporary would dangle
let ptr = &s as *const MyStruct;
unsafe {
// Get address of field 'b'
let b_ptr = &(*ptr).b as *const i64;
// or using offset:
let b_offset = ptr.cast::<i64>().offset(1);
}
}
// Volatile reads/writes (for memory-mapped I/O)
unsafe fn read_volatile_register(addr: *const u32) -> u32 {
ptr::read_volatile(addr)
}
unsafe fn write_volatile_register(addr: *mut u32, value: u32) {
ptr::write_volatile(addr, value);
}
bindgen & cbindgen Tools
// bindgen — auto-generate Rust bindings from C headers
// Cargo.toml:
// [build-dependencies]
// bindgen = "0.68"
// build.rs:
use bindgen;
fn main() {
let bindings = bindgen::Builder::default()
.header("wrapper.h")
.generate()
.expect("Unable to generate bindings");
bindings.write_to_file("src/bindings.rs")
.expect("Couldn't write bindings!");
}
// Generates Rust code like:
// extern "C" {
// pub fn c_function(a: i32, b: i32) -> i32;
// }
// cbindgen — auto-generate C bindings from Rust code
// Cargo.toml:
// [package.metadata.cbindgen]
// language = "C"
// [lib]
// crate_type = ["cdylib", "staticlib"]
// rust_lib.rs:
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
a + b
}
#[no_mangle]
pub extern "C" fn process(ptr: *mut i32) {
unsafe { *ptr += 1; }
}
// cbindgen generates:
// int32_t add(int32_t a, int32_t b);
// void process(int32_t *ptr);
Safety Wrappers
use std::ffi::{CStr, CString};
use std::ptr;
// Unsafe C function
extern "C" {
fn unsafe_c_func(ptr: *const u8, len: usize) -> i32;
}
// Safe wrapper
pub fn safe_wrapper(data: &[u8]) -> i32 {
unsafe {
unsafe_c_func(data.as_ptr(), data.len())
}
}
// Wrapper with error handling
pub fn wrapper_with_error(data: &str) -> Result<i32, &'static str> {
let c_str = CString::new(data).map_err(|_| "contains null byte")?;
let result = unsafe { unsafe_c_func(c_str.as_ptr() as *const u8, data.len()) };
match result {
0 => Ok(0),
-1 => Err("function failed"),
_ => Ok(result),
}
}
// Allocation wrapper
extern "C" {
fn c_malloc(size: usize) -> *mut u8;
fn c_free(ptr: *mut u8);
}
pub struct MallocBuffer(*mut u8, usize);
impl MallocBuffer {
pub fn new(size: usize) -> Option<Self> {
let ptr = unsafe { c_malloc(size) };
if ptr.is_null() {
None
} else {
Some(MallocBuffer(ptr, size))
}
}
pub fn as_slice(&self) -> &[u8] {
unsafe { std::slice::from_raw_parts(self.0, self.1) }
}
pub fn as_mut_slice(&mut self) -> &mut [u8] {
unsafe { std::slice::from_raw_parts_mut(self.0, self.1) }
}
}
impl Drop for MallocBuffer {
fn drop(&mut self) {
unsafe { c_free(self.0); }
}
}
repr(C) Layout
// repr(Rust) — Rust's optimized layout (default)
struct RustLayout {
a: u8, // 1 byte
b: u64, // 8 bytes
c: u32, // 4 bytes
}
// Rust may reorder fields to minimize padding (exact layout is unspecified):
// e.g. [b: u64, c: u32, a: u8] — 13 bytes of fields + 3 padding = 16 bytes
// repr(C) — C-compatible layout (sequential)
#[repr(C)]
struct CLayout {
a: u8, // offset 0, size 1
b: u64, // offset 8 (after 7 padding bytes), size 8
c: u32, // offset 16, size 4
}
// Exact order: [a: u8] [padding: 7] [b: u64] [c: u32] [padding: 4] — 24 bytes
// repr(transparent) — single field with same layout
#[repr(transparent)]
struct Wrapper(String);
// Wrapper has same layout as String (can transmute safely)
// Using repr(C) with FFI
extern "C" {
fn c_struct_process(data: *const CLayout);
}
fn main() {
let layout = CLayout { a: 1, b: 2, c: 3 };
unsafe {
c_struct_process(&layout);
}
}
// Checking layout with std::mem (offset_of! is stable since Rust 1.77)
use std::mem;
fn layout_info() {
println!("Size: {}", mem::size_of::<CLayout>());
println!("Align: {}", mem::align_of::<CLayout>());
println!("Offset of b: {}", mem::offset_of!(CLayout, b));
}
28. Custom Allocators & Memory
Global Allocator Trait
use std::alloc::Layout;
// The GlobalAlloc trait (as defined in std::alloc — reproduced here for
// reference, so it is not also imported)
pub unsafe trait GlobalAlloc {
unsafe fn alloc(&self, layout: Layout) -> *mut u8;
unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout);
// Optional overrides
unsafe fn realloc(
&self,
ptr: *mut u8,
old_layout: Layout,
new_layout: Layout,
) -> *mut u8 {
// Default: alloc + copy + dealloc
let new_ptr = self.alloc(new_layout);
if !new_ptr.is_null() {
std::ptr::copy_nonoverlapping(
ptr,
new_ptr,
old_layout.size().min(new_layout.size()),
);
self.dealloc(ptr, old_layout);
}
new_ptr
}
unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
let ptr = self.alloc(layout);
if !ptr.is_null() {
std::ptr::write_bytes(ptr, 0, layout.size());
}
ptr
}
}
// Simple counting allocator
use std::alloc::{GlobalAlloc, Layout, System};
struct CountingAllocator;
unsafe impl GlobalAlloc for CountingAllocator {
unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
let ret = System.alloc(layout);
if !ret.is_null() {
// Careful: printing may itself allocate — real trackers use atomics
// rather than doing I/O inside alloc/dealloc
eprintln!("Allocated {} bytes", layout.size());
}
ret
}
unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
std::alloc::System.dealloc(ptr, layout);
eprintln!("Deallocated {} bytes", layout.size());
}
}
// Set the global allocator
#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;
jemalloc & mimalloc Integration
// Using jemalloc — efficient memory allocator
// Cargo.toml:
// [dependencies]
// jemallocator = "0.5"
use jemallocator::Jemalloc;
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;
fn main() {
let _v = vec![1, 2, 3];
// Allocated with jemalloc instead of System allocator
}
// Using mimalloc — Microsoft's fast allocator
// Cargo.toml:
// [dependencies]
// mimalloc = "0.1"
use mimalloc::MiMalloc;
#[global_allocator]
static GLOBAL: MiMalloc = MiMalloc;
fn main() {
let _v = vec![1, 2, 3];
// Allocated with mimalloc
}
// Measuring allocator performance
fn bench_allocator() {
use std::time::Instant;
let start = Instant::now();
let mut vecs = Vec::new();
for _ in 0..10000 {
vecs.push(vec![0u8; 1024]);
}
println!("Time: {:?}", start.elapsed());
}
// jemalloc and mimalloc are generally faster than System
// for fragmentation-heavy workloads
Arena Allocators (bumpalo)
use bumpalo::Bump;
// Arena allocator — allocate many objects, free all at once
fn process_with_arena() {
let arena = Bump::new();
// All allocations live in the arena
let vec = arena.alloc(vec![1, 2, 3]);
let string = arena.alloc(String::from("hello"));
let boxed = arena.alloc(42i32);
println!("{:?}", vec);
println!("{}", string);
println!("{}", boxed);
// When the arena goes out of scope, its memory is freed in one shot.
// Caveat: Bump does NOT run Drop on allocated values, so the heap buffers
// owned by the Vec and String above are leaked — prefer
// bumpalo::collections::Vec/String for arena-backed containers.
}
// Useful for temporary allocations in tight loops.
// (Allocating in the arena only to clone back out, as below, defeats the
// purpose — shown purely to illustrate escaping the arena's lifetime.)
fn parse_many_strings(input: &str) -> Vec<String> {
let arena = Bump::new();
input
.lines()
.map(|line| {
let processed = arena.alloc(line.to_uppercase());
processed.clone() // Clone to escape the arena's lifetime
})
.collect()
}
// Custom struct with arena allocation
struct Parser<'a> {
arena: &'a Bump,
}
impl<'a> Parser<'a> {
fn new(arena: &'a Bump) -> Self {
Parser { arena }
}
fn allocate_string(&self, s: &str) -> &'a mut String {
self.arena.alloc(String::from(s))
}
}
fn main() {
let arena = Bump::new();
let parser = Parser::new(&arena);
let s = parser.allocate_string("hello");
s.push_str(" world");
println!("{}", s);
}
Memory Layout: Size, Align, Padding
use std::mem::{size_of, align_of};
use std::ptr;
// Basic layout
struct Simple {
a: u32, // 4 bytes, align 4
b: u8, // 1 byte, align 1
}
fn analyze_layout() {
println!("Size: {}", size_of::<Simple>()); // 8 (not 5!)
println!("Align: {}", align_of::<Simple>()); // 4
// Rust pads with 3 bytes between a and b for alignment
}
// Layout with explicit repr(C)
#[repr(C)]
struct CStyle {
a: u32, // offset 0, size 4
b: u8, // offset 4, size 1
c: u64, // offset 8 (8-byte align), size 8
}
fn c_layout() {
println!("Size: {}", size_of::<CStyle>()); // 16
// Padding:
// [a: 4 bytes] [b: 1 byte] [padding: 3] [c: 8 bytes]
}
// Reordered for efficiency
struct Reordered {
a: u64, // 8 bytes, align 8
b: u32, // 4 bytes, align 4
c: u8, // 1 byte, align 1
}
fn optimal_layout() {
println!("Size: {}", size_of::<Reordered>()); // 16 (not 13)
// Rust reorders to: [a: u64] [b: u32] [c: u8, padding: 3]
}
// Calculating field offsets (offset_of! is stable since Rust 1.77;
// dereferencing a null pointer to compute offsets is UB — don't do it)
fn field_offsets() {
#[repr(C)]
struct Data {
a: u32,
b: u64,
c: u32,
}
println!("a at: {}", std::mem::offset_of!(Data, a)); // 0
println!("b at: {}", std::mem::offset_of!(Data, b)); // 8
println!("c at: {}", std::mem::offset_of!(Data, c)); // 16
}
// Zero-sized types (ZST)
struct ZeroSized;
fn zst_info() {
println!("Size: {}", size_of::<ZeroSized>()); // 0
println!("Align: {}", align_of::<ZeroSized>()); // 1
}
// Array padding
struct Array {
items: [u64; 3], // 24 bytes, align 8
}
// Nested struct padding
struct Nested {
a: u32,
inner: Simple, // align 4, so 0 padding before it
}
alloc::Layout
use std::alloc::Layout;
use std::ptr::NonNull;
// Creating layouts
fn create_layouts() {
// For a single type
let layout = Layout::new::<u32>();
println!("u32: size {}, align {}", layout.size(), layout.align());
// For an array
let array_layout = Layout::new::<[u32; 10]>();
println!("Array: size {}, align {}", array_layout.size(), array_layout.align());
// Manual layout
let manual = Layout::from_size_align(16, 8).unwrap();
println!("Manual: size {}, align {}", manual.size(), manual.align());
// Extend a layout (for flexible array members)
let base = Layout::new::<u32>();
let (extended, _offset) = base.extend(Layout::new::<u8>()).unwrap();
println!("Extended: size {}", extended.size());
}
// Allocating with Layout
fn allocate_with_layout() {
let layout = Layout::new::<[u64; 10]>();
let ptr = unsafe {
std::alloc::alloc(layout) as *mut u64
};
if ptr.is_null() {
panic!("allocation failed");
}
unsafe {
// Initialize
for i in 0..10 {
*ptr.add(i) = i as u64;
}
// Use
println!("{}", *ptr.add(5));
// Deallocate
std::alloc::dealloc(ptr as *mut u8, layout);
}
}
// Layout for variadic structs
fn flexible_array() {
#[repr(C)]
struct Header {
id: u32,
count: u32,
}
fn allocate_with_data(count: usize) -> *mut Header {
let header_layout = Layout::new::<Header>();
let data_layout = Layout::array::<u64>(count).unwrap();
let (layout, offset) = header_layout.extend(data_layout).unwrap();
let ptr = unsafe { std::alloc::alloc(layout) as *mut Header };
unsafe {
(*ptr).id = 1;
(*ptr).count = count as u32;
// Data starts at ptr + offset
let data = (ptr as *mut u8).add(offset) as *mut u64;
for i in 0..count {
*data.add(i) = i as u64;
}
}
ptr
}
}
Custom Collections with Allocators
use std::alloc::{GlobalAlloc, Layout};
use std::ptr::NonNull;
// Custom vector-like collection using a specific allocator
struct CustomVec<T, A: GlobalAlloc> {
ptr: NonNull<T>,
capacity: usize,
len: usize,
allocator: A,
}
impl<T, A: GlobalAlloc> CustomVec<T, A> {
fn new(allocator: A) -> Self {
CustomVec {
ptr: NonNull::dangling(),
capacity: 0,
len: 0,
allocator,
}
}
fn push(&mut self, value: T) {
if self.len == self.capacity {
self.grow();
}
unsafe {
std::ptr::write(self.ptr.as_ptr().add(self.len), value);
}
self.len += 1;
}
fn grow(&mut self) {
// Simplified: no zero-sized-type or capacity-overflow handling
let new_capacity = if self.capacity == 0 { 1 } else { self.capacity * 2 };
let new_layout = Layout::array::<T>(new_capacity).unwrap();
let new_ptr = if self.capacity == 0 {
unsafe { self.allocator.alloc(new_layout) }
} else {
let old_layout = Layout::array::<T>(self.capacity).unwrap();
unsafe {
self.allocator.realloc(
self.ptr.as_ptr() as *mut u8,
old_layout,
new_layout,
)
}
};
self.ptr = NonNull::new(new_ptr as *mut T).expect("allocation failed");
self.capacity = new_capacity;
}
fn pop(&mut self) -> Option<T> {
if self.len == 0 {
None
} else {
self.len -= 1;
unsafe { Some(std::ptr::read(self.ptr.as_ptr().add(self.len))) }
}
}
}
impl<T, A: GlobalAlloc> Drop for CustomVec<T, A> {
fn drop(&mut self) {
if self.capacity != 0 {
unsafe {
for i in 0..self.len {
std::ptr::drop_in_place(self.ptr.as_ptr().add(i));
}
let layout = Layout::array::<T>(self.capacity).unwrap();
self.allocator.dealloc(self.ptr.as_ptr() as *mut u8, layout);
}
}
}
}
Zero-Copy Patterns
use std::mem;
// Pattern 1: Direct buffer reinterpretation (result is native-endian)
fn bytes_to_u32s(bytes: &[u8]) -> &[u32] {
assert_eq!(bytes.len() % 4, 0);
assert_eq!(bytes.as_ptr() as usize % 4, 0); // Alignment must hold, or this is UB
unsafe {
std::slice::from_raw_parts(
bytes.as_ptr() as *const u32,
bytes.len() / 4,
)
}
}
// Pattern 2: Transmute for layout-compatible types
#[repr(C)]
struct Bytes4 {
a: u8,
b: u8,
c: u8,
d: u8,
}
#[repr(transparent)]
struct U32Wrapper(u32);
fn transmute_zero_copy(b: Bytes4) -> U32Wrapper {
unsafe { mem::transmute(b) }
}
// Pattern 3: mmap files as structures
fn load_from_file() {
// In real code, use memmap crate
// let mmap = Mmap::open("file.bin").unwrap();
// let data: &[Header] = unsafe {
// std::slice::from_raw_parts(
// mmap.as_ptr() as *const Header,
// mmap.len() / size_of::<Header>(),
// )
// };
}
// Pattern 4: Endian-aware zero-copy reading
fn read_u32_le(bytes: &[u8]) -> u32 {
let mut buf = [0u8; 4];
buf.copy_from_slice(&bytes[..4]);
u32::from_le_bytes(buf)
}
// Pattern 5: Reference casting without copy
struct NetworkPacket {
data: [u8; 100],
}
fn parse_packet(packet: &NetworkPacket) {
let header: &[u8; 4] = unsafe {
&*(packet.data.as_ptr() as *const [u8; 4])
};
println!("Header: {:?}", header);
}
29. SIMD & Inline Assembly
std::arch Intrinsics
// SIMD using x86 intrinsics
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;
#[cfg(target_arch = "x86_64")]
fn simd_add_i32() {
unsafe {
// Create SIMD vectors
let a = _mm_setr_epi32(1, 2, 3, 4);
let b = _mm_setr_epi32(5, 6, 7, 8);
// Add all 4 i32s in parallel
let c = _mm_add_epi32(a, b);
// Extract results
let result = [
_mm_extract_epi32::<0>(c),
_mm_extract_epi32::<1>(c),
_mm_extract_epi32::<2>(c),
_mm_extract_epi32::<3>(c),
];
println!("{:?}", result); // [6, 8, 10, 12]
}
}
// Floating point SIMD
#[cfg(target_arch = "x86_64")]
fn simd_multiply_f32() {
unsafe {
let a = _mm_setr_ps(1.0, 2.0, 3.0, 4.0);
let b = _mm_setr_ps(2.0, 2.0, 2.0, 2.0);
let c = _mm_mul_ps(a, b);
println!("{:?}", [
_mm_cvtss_f32(c),
_mm_cvtss_f32(_mm_shuffle_ps::<0b01010101>(c, c)),
]);
}
}
// SIMD memory operations (SSE4.2)
#[cfg(all(target_arch = "x86_64", target_feature = "sse4.2"))]
fn simd_memcpy(src: &[u8], dst: &mut [u8]) {
unsafe {
for i in (0..src.len()).step_by(16) {
let data = _mm_loadu_si128(src.as_ptr().add(i) as *const __m128i);
_mm_storeu_si128(dst.as_mut_ptr().add(i) as *mut __m128i, data);
}
}
}
Portable SIMD (std::simd — Nightly)
#![feature(portable_simd)]
// Trait paths have churned on nightly; recent toolchains expose everything
// through the prelude
use std::simd::prelude::*; // Simd, SimdFloat, SimdPartialOrd, ...
fn portable_simd_add() {
// Create SIMD vectors portable across architectures
let a: Simd<i32, 4> = Simd::from_array([1, 2, 3, 4]);
let b: Simd<i32, 4> = Simd::from_array([5, 6, 7, 8]);
let c = a + b;
println!("{:?}", c.to_array()); // [6, 8, 10, 12]
}
fn portable_simd_float() {
let a: Simd<f32, 4> = Simd::from_array([1.0, 2.0, 3.0, 4.0]);
let b: Simd<f32, 4> = Simd::from_array([0.1, 0.2, 0.3, 0.4]);
let c = a + b;
let d = a.abs();
let e = a.sqrt();
println!("Add: {:?}", c.to_array());
println!("Abs: {:?}", d.to_array());
println!("Sqrt: {:?}", e.to_array());
}
fn simd_reduction() {
let v: Simd<i32, 4> = Simd::from_array([1, 2, 3, 4]);
let sum = v.reduce_sum(); // 10
let max = v.reduce_max(); // 4
let min = v.reduce_min(); // 1
println!("Sum: {}, Max: {}, Min: {}", sum, max, min);
}
fn simd_comparison() {
let a: Simd<i32, 4> = Simd::from_array([1, 2, 3, 4]);
let b: Simd<i32, 4> = Simd::from_array([2, 2, 2, 2]);
let mask = a.simd_lt(b); // Element-wise <
// mask represents: [true, false, false, false]
}
#[target_feature] & CPU Feature Detection
// Function requiring specific CPU feature
#[target_feature(enable = "avx2")]
unsafe fn avx2_sum(v: &[f32]) -> f32 {
use std::arch::x86_64::*;
let mut sum = _mm256_setzero_ps();
for chunk in v.chunks_exact(8) {
let data = _mm256_loadu_ps(chunk.as_ptr());
sum = _mm256_add_ps(sum, data);
}
// Horizontal reduction: spill the 8 lanes to an array and sum them
// (remainder elements from chunks_exact are ignored in this sketch)
let mut lanes = [0.0f32; 8];
_mm256_storeu_ps(lanes.as_mut_ptr(), sum);
lanes.iter().sum()
}
// Runtime detection
fn main() {
#[cfg(target_arch = "x86_64")]
{
// CPUID is always available on x86_64 — just query the feature
if is_x86_feature_detected!("avx2") {
let sum = unsafe { avx2_sum(&[1.0; 16]) };
println!("AVX2 sum: {}", sum);
} else {
println!("AVX2 not supported");
}
}
}
// Compile-time features — set via rustflags in .cargo/config.toml
// (rustflags is not a [profile] key in Cargo.toml):
// [build]
// rustflags = ["-C", "target-feature=+avx2"]
// Check at compile time
#[cfg(target_feature = "avx2")]
fn avx2_enabled() {
println!("Compiled with AVX2");
}
#[cfg(not(target_feature = "avx2"))]
fn avx2_disabled() {
println!("No AVX2");
}
asm! Macro for Inline Assembly
use std::arch::asm;
// Basic inline assembly (x86-64 syntax)
fn add_asm(a: i32, b: i32) -> i32 {
let result: i32;
unsafe {
asm!(
"add {}, {}",
inout(reg) a => result,
in(reg) b,
);
}
result
}
// Memory-mapped I/O
unsafe fn read_register(addr: *const u32) -> u32 {
let value: u32;
asm!(
"mov {}, [{}]",
out(reg) value,
in(reg) addr,
);
value
}
unsafe fn write_register(addr: *mut u32, value: u32) {
asm!(
"mov [{}], {}",
in(reg) addr,
in(reg) value,
);
}
// Reading CPUID — note: rbx/ebx is reserved by LLVM and cannot be used
// as an inline-asm operand, so use the __cpuid intrinsic instead
fn cpu_id() -> u32 {
#[cfg(target_arch = "x86_64")]
unsafe {
return std::arch::x86_64::__cpuid(1).eax;
}
#[allow(unreachable_code)]
0
}
// Inline assembly with options
fn barrier() {
unsafe {
asm!(
"mfence",
options(nomem, nostack),
);
}
}
// asm! is volatile by default — always emitted unless options(pure) is given
fn spin_hint() {
unsafe {
asm!("pause", options(nomem, nostack, preserves_flags));
}
}
Performance Examples
use std::time::Instant;
// SIMD vs scalar loop
fn scalar_sum(data: &[f32]) -> f32 {
data.iter().sum()
}
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn simd_sum(data: &[f32]) -> f32 {
use std::arch::x86_64::*;
let mut sum = _mm256_setzero_ps();
for chunk in data.chunks_exact(8) {
let v = _mm256_loadu_ps(chunk.as_ptr());
sum = _mm256_add_ps(sum, v);
}
// Horizontal sum
let mut arr = [0.0f32; 8];
_mm256_storeu_ps(arr.as_mut_ptr(), sum);
arr.iter().sum()
}
fn bench() {
let data: Vec<f32> = (0..10_000).map(|i| i as f32).collect();
let start = Instant::now();
let _scalar = scalar_sum(&data); // result unused; computed for timing
let scalar_time = start.elapsed();
#[cfg(target_arch = "x86_64")]
{
if is_x86_feature_detected!("avx2") {
let start = Instant::now();
let _simd = unsafe { simd_sum(&data) };
let simd_time = start.elapsed();
println!("Scalar: {:?}, SIMD: {:?}", scalar_time, simd_time);
println!("Speedup: {:.2}x", scalar_time.as_nanos() as f32 / simd_time.as_nanos() as f32);
}
}
}
30. Cargo & Build System Deep Dive
Cargo.toml Anatomy
[package]
name = "my_project"
version = "0.1.0"
edition = "2021"
authors = ["You"]
description = "A great library"
license = "MIT"
repository = "https://github.com/user/repo"
homepage = "https://example.com"
documentation = "https://docs.rs/my_project"
keywords = ["parsing", "cli"]
categories = ["command-line-utilities"]
readme = "README.md"
rust-version = "1.70" # MSRV
[lib]
name = "my_lib"
path = "src/lib.rs"
crate-type = ["rlib"] # or "cdylib", "staticlib", "dylib"
[[bin]]
name = "my_cli"
path = "src/main.rs"
required-features = ["cli"]
[dependencies]
serde = "1.0"
tokio = { version = "1.0", features = ["full"] }
my_crate = { path = "../my_crate" }
git_crate = { git = "https://github.com/user/repo", branch = "main" }
[dev-dependencies]
criterion = "0.5"
proptest = "1.0"
[build-dependencies]
bindgen = "0.69"
[target.'cfg(unix)'.dependencies]
libc = "0.2"
[target.'cfg(windows)'.dependencies]
windows = "0.52"
[features]
default = ["cli", "serde"]
cli = ["clap"]
performance = ["simd"]
serde = ["dep:serde"]
serde-support = ["serde"] # Feature dependency
[profile.release]
opt-level = 3
lto = true
codegen-units = 1
strip = true
[profile.bench]
inherits = "release"
debug = true
[package.metadata.cargo-binstall]
pkg-url = "https://example.com/{name}-{version}.tar.gz"
Features & Conditional Compilation
// In Cargo.toml:
[features]
default = ["networking"]
networking = []
logging = ["tracing"]
compression = ["flate2"]
all-features = ["networking", "logging", "compression"]
// In code:
#[cfg(feature = "networking")]
mod network {
pub fn connect() {
println!("Connecting...");
}
}
#[cfg(feature = "logging")]
fn log(msg: &str) {
println!("{}", msg);
}
#[cfg(not(feature = "logging"))]
fn log(_msg: &str) {}
// Feature combination logic
#[cfg(all(feature = "networking", feature = "logging"))]
fn setup() {
log("Setting up network...");
}
#[cfg(any(feature = "compression", feature = "async"))]
async fn compress_or_process() {}
// Build with features:
// cargo build --features "networking,logging"
// cargo build --all-features
// cargo build --no-default-features
// cargo build --no-default-features --features "logging"
cfg Attributes
| Attribute | Purpose | Example |
|---|---|---|
| #[cfg(test)] | Only include in test builds | #[cfg(test)] mod tests {} |
| #[cfg(debug_assertions)] | Only in debug builds | #[cfg(debug_assertions)] println!("..."); |
| #[cfg(target_os = "linux")] | OS-specific code | #[cfg(target_os = "linux")] fn linux_func() {} |
| #[cfg(target_arch = "x86_64")] | Architecture-specific | #[cfg(target_arch = "wasm32")] |
| #[cfg(feature = "foo")] | Feature-gated code | #[cfg(feature = "serde")] |
| #[cfg_attr(test, ignore)] | Conditional attributes | Skip test conditionally |
// Platform-specific code
#[cfg(unix)]
fn get_home_dir() -> String {
std::env::var("HOME").unwrap()
}
#[cfg(windows)]
fn get_home_dir() -> String {
std::env::var("USERPROFILE").unwrap()
}
// Debug vs release
#[cfg(debug_assertions)]
fn expensive_check(x: i32) -> bool {
println!("Checking {}", x);
x > 0
}
#[cfg(not(debug_assertions))]
fn expensive_check(x: i32) -> bool {
x > 0 // Optimized away in release
}
// Test-only code
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_something() {
assert_eq!(1 + 1, 2);
}
}
// Conditional attributes
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
struct MyStruct {
x: i32,
}
// Compile-time configuration
const IS_DEBUG: bool = cfg!(debug_assertions);
Workspaces
// Workspace root (Cargo.toml)
[workspace]
members = [
"crates/core",
"crates/cli",
"crates/web",
]
exclude = ["crates/experimental"]
[workspace.package]
version = "0.2.0"
edition = "2021"
authors = ["Team"]
license = "MIT"
// Shared dependencies
[workspace.dependencies]
tokio = { version = "1.0", features = ["full"] }
serde = "1.0"
// Individual crate: crates/core/Cargo.toml
[package]
name = "my-core"
version.workspace = true
edition.workspace = true
authors.workspace = true
[dependencies]
tokio = { workspace = true }
serde = { workspace = true }
// Usage:
// cargo build --workspace (build all members)
// cargo build -p my-core (specific crate)
// cargo test --workspace (test all)
// cargo publish -p my-cli
Build Scripts (build.rs)
// build.rs in crate root
use std::env;
use std::path::PathBuf;
fn main() {
// Print build information
println!("cargo:warning=Building for {}", env::var("PROFILE").unwrap());
// Link C libraries
println!("cargo:rustc-link-lib=mylib");
println!("cargo:rustc-link-search=/usr/local/lib");
// Conditional linking
let target = env::var("TARGET").unwrap();
if target.contains("x86_64") {
println!("cargo:rustc-link-search=/opt/simd/lib");
}
// Define cfg
println!("cargo:rustc-cfg=has_custom_build");
// Rerun if changed
println!("cargo:rerun-if-changed=build.rs");
println!("cargo:rerun-if-changed=src/ffi.h");
println!("cargo:rerun-if-env-changed=MYLIB_PATH");
// Generate code
#[cfg(feature = "codegen")]
{
let out_dir = PathBuf::from(env::var("OUT_DIR").unwrap());
let dest = out_dir.join("generated.rs");
std::fs::write(&dest, "pub const GENERATED: &str = \"hello\";").unwrap();
println!("cargo:rustc-env=GENERATED_PATH={}", dest.display());
}
}
// In src/lib.rs:
#[cfg(has_custom_build)]
fn custom_build_marker() {
println!("Custom build was used!");
}
// Include generated code
include!(concat!(env!("OUT_DIR"), "/generated.rs"));
Custom Profiles
// Cargo.toml
[profile.release]
opt-level = 3
lto = true
codegen-units = 1
strip = true
panic = "abort"
[profile.bench]
inherits = "release"
debug = true
[profile.custom]
inherits = "release"
split-debuginfo = "packed"
opt-level = 2
[profile.dev]
opt-level = 0
debug = true
split-debuginfo = "unpacked"
// Usage:
// cargo build (uses dev)
// cargo build --release
// cargo build --profile custom
// cargo bench (uses bench)
// Profile options:
// opt-level: 0-3, z (size), s (small)
// lto: true, fat, thin, false
// codegen-units: 1-256 (1=slower build, best optimization)
// panic: unwind, abort
// strip: false, true, symbols, debuginfo
// split-debuginfo: packed, unpacked, off
// debug: 0 (none), 1 (limited), 2 (full)
// incremental: true/false
Cross Compilation
// Install cross
// cargo install cross
// Build for different target
// cross build --target x86_64-unknown-linux-musl
// cross build --target aarch64-unknown-linux-gnu
// cross build --target wasm32-unknown-unknown
// cross build --target x86_64-pc-windows-gnu
// Rustup targets
// rustup target list
// rustup target add wasm32-unknown-unknown
// .cargo/config.toml
[build]
target = "x86_64-unknown-linux-musl"
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
// Conditional dependencies for targets
[target.'cfg(unix)'.dependencies]
libc = "0.2"
[target.'cfg(target_os = "windows")'.dependencies]
winapi = "0.3"
[target.'cfg(target_arch = "wasm32")'.dependencies]
wasm-bindgen = "0.2"
// Profile settings per target
[profile.release]
opt-level = 3
[profile.dev]
split-debuginfo = "unpacked"
31. Modules, Visibility & Project Structure
mod, pub, Visibility Levels
// Private by default (no keyword)
fn private_function() {}
// pub — public everywhere
pub fn public_function() {}
// pub(crate) — public in this crate only
pub(crate) fn internal_function() {}
// pub(super) — public to parent module
pub(super) fn parent_visible_function() {}
// pub(in path) — visible within a given ancestor module
pub(in crate::outer) fn ancestor_visible() {} // path must be an ancestor of this module
// Visibility on struct fields
pub struct MyStruct {
pub public_field: i32,
private_field: String, // Private
pub(crate) crate_field: bool,
}
// Visibility on enum variants
pub enum Status {
Ready, // Variants can't carry their own `pub`;
Processing, // they inherit the enum's visibility
Error(String),
}
// Trait visibility
pub trait MyTrait {
fn public_method(&self); // Trait items share the trait's visibility —
// there are no private trait methods; use #[doc(hidden)] or the
// "sealed trait" pattern to discourage outside use
}
// Trait impl visibility
pub struct MyType;
impl MyTrait for MyType {
fn public_method(&self) {}
}
// An impl block has no visibility of its own; reachability follows
// the trait and the type it is implemented for
// Re-exports with pub use
pub use std::collections::HashMap;
// Publicly re-exports HashMap from this module
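All of these visibility levels can be demonstrated with inline modules in a single file (module and function names are illustrative):

```rust
mod outer {
    pub mod inner {
        // Public: reachable from anywhere the module path is reachable
        pub fn public_api() -> i32 { helper() + crate_wide() }
        // Private: only callable inside `inner`
        fn helper() -> i32 { 1 }
        // pub(crate): visible anywhere in this crate
        pub(crate) fn crate_wide() -> i32 { 2 }
        // pub(super): visible in the parent module `outer` only
        pub(super) fn for_outer() -> i32 { 3 }
    }
    pub fn from_outer() -> i32 { inner::for_outer() }
}
```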
Module Tree vs File Tree
// File structure:
// src/
// lib.rs
// math/
// mod.rs
// algebra.rs
// geometry.rs
// network/
// mod.rs
// http.rs
// tcp.rs
// lib.rs declares module structure
pub mod math;
pub mod network;
// This tells Rust to load src/math/mod.rs or src/math.rs
// src/math/mod.rs
pub mod algebra;
pub mod geometry;
pub fn calculate() {}
// src/math/algebra.rs
pub fn solve_equation() {}
// Access paths:
// crate::math::calculate()
// crate::math::algebra::solve_equation()
// crate::math::geometry::... (if declared in mod.rs)
// Old-style inline modules
mod inline_module {
pub fn inner() {}
}
fn use_inline() {
inline_module::inner();
}
// File as module directory (src/network/mod.rs)
pub mod http;
pub mod tcp;
pub fn connect() {}
// Access: crate::network::connect(), crate::network::http::...
// File as module (src/network.rs)
pub fn connect() {}
// Access: crate::network::connect()
// Since the 2018 edition, src/network.rs can also declare submodules:
// `pub mod http;` here loads src/network/http.rs — mod.rs is optional
use & Re-exports
// Importing specific items
use std::collections::HashMap;
use std::fs::File;
// Glob imports
use std::io::*; // Imports Read, Write, etc.
// Aliasing
use std::collections::HashMap as HMap;
// Multiple imports
use std::io::{self, Read, Write};
// self imports the io module itself
// Nested imports
use std::fs::{
self,
File,
OpenOptions,
};
// Relative imports (crate-local)
use super::parent_module;
use super::super::grandparent;
use crate::root_module;
use self::sibling;
// Re-exporting
pub use std::collections::HashMap;
// User can do: use my_crate::HashMap
// Selective re-export
pub use crate::internal::process;
// Hides internal module, exposes process function
// Re-export with alias
pub use std::io::Result as IoResult;
// Glob re-export
pub use crate::utils::*;
// Conditional re-exports
#[cfg(feature = "serde")]
pub use serde::{Serialize, Deserialize};
Prelude Pattern
// src/lib.rs or src/prelude.rs
pub mod prelude {
// Common traits users will want
pub use crate::traits::{MyTrait, AnotherTrait};
// Common types
pub use crate::types::{Result, Error};
// Frequently used functions
pub use crate::builder::Builder;
}
// User code:
use my_crate::prelude::*;
// Now they have:
// - MyTrait, AnotherTrait
// - Result, Error
// - Builder
// Standard library examples:
use std::prelude::v1::*; // Implicitly imported (2015/2018 editions; later editions use rust_2021 etc.)
// Imports: Sized, Sync, Send, Unpin, Drop, Clone, Copy,
// Default, Eq, Ord, AsRef, AsMut, From, Into,
// IntoIterator, Iterator, DoubleEndedIterator,
// ExactSizeIterator, ToString, Option, Result, Vec, String, etc.
// Creating a minimal prelude for a library
pub mod prelude {
pub use crate::{
Database,
Query,
ConnectionPool,
Error,
Result,
};
}
Library vs Binary Crate Structure
| Aspect | Library Crate | Binary Crate |
|---|---|---|
| Entry point | src/lib.rs | src/main.rs |
| Output | .rlib (or shared lib) | Executable |
| Visibility | Public API exported | Internal organization |
| Dependencies | Minimal; exported to users | Can have any dependencies |
| Multiple binaries | N/A (one library) | src/bin/*.rs |
| Tests | tests/ directory | Same, plus internal tests |
| Example | Show library usage | N/A |
// Cargo.toml for library
[package]
name = "my_lib"
[lib]
name = "my_lib"
// src/lib.rs — the public API
pub mod prelude;
pub mod error;
pub mod types;
pub mod builder;
pub use error::{Error, Result};
pub use types::Config;
pub use builder::Builder;
// Library doesn't need main.rs; structure is internal
// Cargo.toml for binary
[package]
name = "my_cli"
[[bin]]
name = "my_cli"
path = "src/main.rs"
// src/main.rs — entry point
mod cli;
mod commands;
mod config;
use cli::App;
fn main() {
let app = App::new();
app.run();
}
// src/cli.rs — internal module
pub struct App;
// src/commands/ — internal module structure
pub fn handle_command() {}
// Library + Binary combo
[package]
name = "my_project"
[lib]
name = "my_lib"
[[bin]]
name = "my_cli"
// src/lib.rs — public library
pub mod processor;
pub mod config;
// src/main.rs — uses library
use my_lib::processor;
use my_lib::config;
fn main() {
let config = config::load().unwrap();
processor::run(&config);
}
Common Project Layouts
// Cargo.toml for this structure
[package]
name = "my_crate"
version = "0.1.0"
[lib]
name = "my_crate"
[[example]]
name = "basic"
[[bench]]
name = "processing"
harness = false
// src/lib.rs
pub mod prelude;
pub mod error;
pub mod core;
pub mod utils;
pub use error::{Error, Result};
pub use core::Processor;
// src/error.rs
pub type Result<T> = std::result::Result<T, Error>;
#[derive(Debug)]
pub enum Error {
IoError(std::io::Error),
ParseError(String),
}
// src/core/mod.rs
pub mod processor;
pub mod state;
pub use processor::Processor;
pub use state::State;
// src/core/processor.rs
pub struct Processor;
impl Processor {
pub fn process(&self, data: &str) -> crate::Result<()> {
todo!()
}
}
// examples/basic.rs (compiled separately)
use my_crate::prelude::*;
fn main() {
let proc = Processor::new();
let _ = proc.process("data");
}
// tests/test_api.rs
use my_crate::prelude::*;
#[test]
fn test_processor() {
let proc = Processor::new();
assert!(proc.process("valid").is_ok());
}
32. Embedded Rust & no_std
#![no_std] & #![no_main]
// Bare-metal binary without OS runtime
#![no_std]
#![no_main]
use core::panic::PanicInfo;
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
loop {}
}
#[no_mangle]
pub extern "C" fn _start() -> ! {
loop {}
}
// With no_std, unavailable:
// - std::io, std::fs, std::net
// - std::vec, std::string (use alloc crate instead)
// - std::thread, std::sync (no_std compatible: core::sync::atomic)
// - System memory allocation
// With no_std, available:
// - core:: primitives (core::option, core::result, core::iter)
// - Core traits (Copy, Clone, Drop, Fn)
// - core::mem, core::ptr, core::fmt
// - core::sync::atomic for lock-free synchronization
// no_std with alloc support
#![no_std]
extern crate alloc;
use alloc::vec::Vec;
use alloc::string::String;
fn work() {
let mut v = Vec::new();
v.push(1);
let s = String::from("hello");
}
// Requires a global allocator
use alloc::alloc::GlobalAlloc;
struct MyAllocator;
unsafe impl GlobalAlloc for MyAllocator {
unsafe fn alloc(&self, layout: alloc::alloc::Layout) -> *mut u8 {
todo!()
}
unsafe fn dealloc(&self, ptr: *mut u8, layout: alloc::alloc::Layout) {
todo!()
}
}
#[global_allocator]
static ALLOC: MyAllocator = MyAllocator;
embedded-hal Traits
// The real embedded-hal trait signatures differ between versions 0.2 and
// 1.0; the definitions below are simplified sketches of the same ideas
// GPIO abstraction
trait OutputPin {
type Error;
fn set_low(&mut self) -> Result<(), Self::Error>;
fn set_high(&mut self) -> Result<(), Self::Error>;
}
trait InputPin {
type Error;
fn is_high(&self) -> Result<bool, Self::Error>;
fn is_low(&self) -> Result<bool, Self::Error>;
}
// Example: STM32 LED blink (conceptual)
struct Led<P: OutputPin> {
pin: P,
}
impl<P: OutputPin> Led<P> {
fn new(pin: P) -> Self {
Led { pin }
}
fn on(&mut self) -> Result<(), P::Error> {
self.pin.set_high()
}
fn off(&mut self) -> Result<(), P::Error> {
self.pin.set_low()
}
}
// UART write trait (simplified; the real serial traits are nb-based)
trait SerialWrite<Word = u8> {
type Error;
fn write(&mut self, word: Word) -> Result<(), Self::Error>;
}
fn generic_uart_write<S: SerialWrite<u8>>(serial: &mut S, data: &[u8]) -> Result<(), S::Error> {
for &byte in data {
serial.write(byte)?;
}
Ok(())
}
// SPI trait
trait Transfer<Word = u8> {
type Error;
fn transfer(&mut self, words: &mut [Word]) -> Result<(), Self::Error>;
}
// (Vec/vec! require the alloc crate under no_std)
fn spi_read<S: Transfer<u8>>(spi: &mut S, len: usize) -> Result<Vec<u8>, S::Error> {
let mut buf = vec![0u8; len];
spi.transfer(&mut buf)?;
Ok(buf)
}
// Timer trait
trait CountDown {
type Time;
type Error;
fn start<T>(&mut self, count: T) -> Result<(), Self::Error>
where
T: Into<Self::Time>;
fn wait(&mut self) -> Result<(), Self::Error>;
}
fn delay<T: CountDown>(timer: &mut T, ms: u32) -> Result<(), T::Error> {
timer.start(ms)?;
timer.wait()
}
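Because these HAL traits are plain Rust, drivers written against them can be unit-tested on the host with a mock implementation — no hardware required. A sketch (the `MockTimer` type is invented here; real projects often use the `embedded-hal-mock` crate):

```rust
// Same shape as the CountDown trait above, redefined so this compiles alone.
trait CountDown {
    type Time;
    type Error;
    fn start<T: Into<Self::Time>>(&mut self, count: T) -> Result<(), Self::Error>;
    fn wait(&mut self) -> Result<(), Self::Error>;
}

// Host-side mock: records how the driver used the timer.
struct MockTimer {
    started_with: Option<u32>,
    waits: u32,
}

impl CountDown for MockTimer {
    type Time = u32;
    type Error = ();
    fn start<T: Into<u32>>(&mut self, count: T) -> Result<(), ()> {
        self.started_with = Some(count.into());
        Ok(())
    }
    fn wait(&mut self) -> Result<(), ()> {
        self.waits += 1;
        Ok(())
    }
}

// The driver code under test — generic over any CountDown.
fn delay<C: CountDown>(timer: &mut C, ms: u32) -> Result<(), C::Error>
where
    u32: Into<C::Time>,
{
    timer.start(ms)?;
    timer.wait()
}

fn main() {
    let mut t = MockTimer { started_with: None, waits: 0 };
    delay(&mut t, 50).unwrap();
    assert_eq!(t.started_with, Some(50)); // driver started the timer correctly
    assert_eq!(t.waits, 1);               // and blocked on it exactly once
    println!("ok");
}
```

This is the main payoff of coding against embedded-hal traits: the same driver crate runs on an STM32 in production and under `cargo test` on your laptop.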
Cortex-M & RISC-V Targets
// For ARM Cortex-M microcontrollers
// Cargo.toml:
[dependencies]
cortex-m = "0.7"
cortex-m-rt = "0.7"
[profile.release]
opt-level = "z"
// .cargo/config.toml (target selection lives here, not in Cargo.toml):
[build]
target = "thumbv7em-none-eabihf" # ARM Cortex-M4/M7 (hard float)
# or thumbv6m-none-eabi for M0/M0+
// Cortex-M startup
#![no_std]
#![no_main]
use cortex_m_rt::entry;
#[entry]
fn main() -> ! {
loop {}
}
// RISC-V targets
// [build]
// target = "riscv32i-unknown-none-elf"
// target = "riscv64imac-unknown-none-elf"
#![no_std]
#![no_main]
use riscv_rt::entry;
#[entry]
fn main() -> ! {
// RISC-V main
loop {}
}
// Accessing the NVIC (Cortex-M); the Interrupt enum comes from the
// device PAC crate (e.g. stm32f4), not from cortex-m itself
use cortex_m::peripheral::NVIC;
fn enable_interrupt() {
unsafe {
NVIC::unmask(pac::Interrupt::TIM2); // `pac` = your device crate
}
}
// RISC-V machine register access
use riscv::register::*;
fn read_cycle() -> u64 {
cycle::read() as u64
}
Interrupt Handling
// Cortex-M: SysTick is an architectural exception, handled with
// cortex-m-rt's #[exception] attribute
use cortex_m_rt::exception;
#[exception]
fn SysTick() {
// System timer tick
}
// Device interrupts use the #[interrupt] attribute re-exported by the
// device PAC crate (e.g. `use pac::interrupt;`)
#[interrupt]
fn UART0() {
// UART interrupt
// Process serial data
}
// RISC-V interrupts
#[riscv_rt::entry]
fn main() -> ! {
unsafe {
riscv::interrupt::enable();
}
loop {}
}
// riscv-rt calls ExceptionHandler for otherwise-unhandled traps;
// bare-metal (machine-mode) code reads mcause rather than scause
#[allow(non_snake_case)]
#[no_mangle]
pub fn ExceptionHandler(_trap_frame: &riscv_rt::TrapFrame) {
use riscv::register::mcause::{self, Exception, Interrupt, Trap};
match mcause::read().cause() {
Trap::Interrupt(Interrupt::MachineTimer) => {
// Handle timer
}
Trap::Exception(Exception::IllegalInstruction) => {
// Handle illegal instruction
}
_ => {}
}
}
// Cortex-M with critical sections
use cortex_m::interrupt;
fn critical_section() {
interrupt::free(|cs| {
// No interrupts during this block
// cs: CriticalSection token
perform_atomic_operation();
});
}
fn perform_atomic_operation() {
// Protected from interrupts
}
Memory-Mapped I/O
use core::ptr::{read_volatile, write_volatile};
use core::sync::atomic::{AtomicU32, Ordering};
// Register structure
#[repr(C)]
struct UartRegs {
data: u32, // Offset 0x0
status: u32, // Offset 0x4
control: u32, // Offset 0x8
}
// Bare register access (unsafe); addr_of!/addr_of_mut! avoid creating
// intermediate references to volatile memory
fn uart_read_char(uart_base: *const UartRegs) -> u8 {
unsafe {
let data = read_volatile(core::ptr::addr_of!((*uart_base).data));
(data & 0xFF) as u8
}
}
fn uart_write_char(uart_base: *mut UartRegs, ch: u8) {
unsafe {
write_volatile(core::ptr::addr_of_mut!((*uart_base).data), ch as u32);
}
}
// Volatile wrapper for registers
struct Uart {
base: *mut UartRegs,
}
impl Uart {
fn new(base: usize) -> Self {
Uart { base: base as *mut _ }
}
fn read(&self) -> u8 {
unsafe {
let data = read_volatile(&(*self.base).data);
(data & 0xFF) as u8
}
}
fn write(&mut self, ch: u8) {
unsafe {
write_volatile(&mut (*self.base).data, ch as u32);
}
}
fn is_ready(&self) -> bool {
unsafe {
let status = read_volatile(&(*self.base).status);
(status & 1) != 0
}
}
}
// Atomic register access (for shared mutable state)
struct AtomicReg(AtomicU32);
fn atomic_read(reg: &AtomicReg) -> u32 {
reg.0.load(Ordering::SeqCst)
}
fn atomic_write(reg: &AtomicReg, value: u32) { // &-ref suffices: atomics have interior mutability
reg.0.store(value, Ordering::SeqCst);
}
Real-Time Constraints
// Predictable execution time (no allocations in hot path)
fn process_sensor_data_deterministic(data: &[u8]) -> [u8; 256] {
// Stack-allocated buffer, no heap
let mut output = [0u8; 256];
// Bounded iteration count
for i in 0..data.len().min(256) {
output[i] = data[i].wrapping_mul(2); // placeholder per-byte transform
}
output
}
// WCET (Worst-Case Execution Time) aware code
fn compute_with_bounds() {
// Avoid:
// - malloc/Vec allocations
// - system calls
// - loops with data-dependent iteration counts
// - branches that depend on input (timing attacks)
// Prefer:
// - Stack-allocated fixed buffers
// - Loop unrolling
// - Fixed WCET operations
let mut sum = 0u32;
for &val in &[1, 2, 3, 4] {
sum = sum.wrapping_add(val);
}
}
// Soft real-time with spinlock
use core::sync::atomic::{AtomicBool, Ordering};
struct Mutex(AtomicBool);
impl Mutex {
fn lock(&self) {
while self.0.compare_exchange(
false,
true,
Ordering::Acquire,
Ordering::Relaxed,
).is_err() {
// Spinlock
}
}
fn unlock(&self) {
self.0.store(false, Ordering::Release);
}
}
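The flag-only lock above shows the atomics, but a usable spinlock also has to own the data it protects. A sketch under the same Acquire/Release scheme (the `SpinLock<T>` wrapper and its `with` closure API are invented here; production code would use a crate like `spin` or `critical-section`):

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

struct SpinLock<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

// Safety: access to `data` is serialized by the `locked` flag.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    fn new(data: T) -> Self {
        SpinLock { locked: AtomicBool::new(false), data: UnsafeCell::new(data) }
    }
    // Run `f` with exclusive access to the payload.
    fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop(); // tell the CPU we're busy-waiting
        }
        let result = f(unsafe { &mut *self.data.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

fn main() {
    let lock = Arc::new(SpinLock::new(0u32));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let lock = Arc::clone(&lock);
            thread::spawn(move || {
                for _ in 0..1000 {
                    lock.with(|n| *n += 1);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("{}", lock.with(|n| *n)); // 4000: no increments lost
}
```

The Acquire on lock and Release on unlock are what make the non-atomic `*n += 1` safe: each locker observes all writes made by the previous one.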
// Priority inversion awareness
fn high_priority_task() {
// Should not hold locks for long
// Should not call into low-priority code
}
fn low_priority_task() {
// Can use locks safely
// Won't block high-priority tasks
}
33. WebAssembly (Wasm)
wasm-pack & wasm-bindgen
// Cargo.toml for WASM library
[package]
name = "wasm-lib"
[lib]
crate-type = ["cdylib", "rlib"]
[dependencies]
wasm-bindgen = "0.2"
web-sys = "0.3"
js-sys = "0.3"
// src/lib.rs
use wasm_bindgen::prelude::*;
// Exported to JavaScript
#[wasm_bindgen]
pub fn greet(name: &str) -> String {
format!("Hello, {}!", name)
}
// JavaScript interop
#[wasm_bindgen(js_namespace = console)]
extern "C" {
pub fn log(s: &str);
}
#[wasm_bindgen]
pub fn hello_world() {
log("Hello from Rust!");
}
// Custom types with JavaScript
#[wasm_bindgen]
pub struct Counter {
count: i32,
}
#[wasm_bindgen]
impl Counter {
#[wasm_bindgen(constructor)]
pub fn new() -> Counter {
Counter { count: 0 }
}
pub fn increment(&mut self) {
self.count += 1;
}
pub fn get_count(&self) -> i32 {
self.count
}
}
// Build:
// wasm-pack build --target web
// Generates pkg/ directory with WASM + JS bindings
// JavaScript usage:
// import init, { Counter } from "./pkg/wasm_lib.js";
//
// await init();
// const counter = new Counter();
// counter.increment();
// console.log(counter.get_count());
wasm32-unknown-unknown Target
// Setup
// rustup target add wasm32-unknown-unknown
// Cargo.toml
[lib]
crate-type = ["cdylib"]
// Build bare WASM (no bindings)
// cargo build --target wasm32-unknown-unknown --release
// The resulting .wasm file lands in target/wasm32-unknown-unknown/release/
// Manual WASM export (without wasm-bindgen)
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
a + b
}
// Load in JavaScript:
// const wasm = await WebAssembly.instantiateStreaming(
// fetch('module.wasm')
// );
// const result = wasm.instance.exports.add(5, 3);
// Memory access — return a pointer to static data
// (a pointer into a temporary would dangle)
static DATA: [u8; 5] = [1, 2, 3, 4, 5];
#[no_mangle]
pub extern "C" fn get_memory_ptr() -> *const u8 {
DATA.as_ptr()
}
// Linear memory (WASM's memory model)
static mut BUFFER: [u8; 1024] = [0; 1024];
#[no_mangle]
pub extern "C" fn buffer_addr() -> *const u8 {
unsafe { core::ptr::addr_of!(BUFFER) as *const u8 } // avoid &-refs to static mut
}
#[no_mangle]
pub extern "C" fn process_buffer(len: usize) {
unsafe {
for i in 0..len.min(1024) {
BUFFER[i] = BUFFER[i].wrapping_add(1);
}
}
}
JS Interop
use wasm_bindgen::prelude::*;
use web_sys::window;
// Call JavaScript functions
#[wasm_bindgen]
pub fn call_js_function() {
web_sys::window()
.and_then(|w| w.document())
.and_then(|d| d.get_element_by_id("output"))
.and_then(|e| {
e.set_text_content(Some("Hello from Rust!"));
Some(())
});
}
// Access DOM
#[wasm_bindgen]
pub fn modify_dom() {
let window = web_sys::window().expect("no window");
let document = window.document().expect("no document");
let body = document.body().expect("no body");
let div = document.create_element("div").unwrap();
div.set_text_content(Some("Generated by Rust"));
body.append_child(&div).unwrap();
}
// Event handling
use wasm_bindgen::JsCast;
use web_sys::HtmlInputElement;
#[wasm_bindgen]
pub fn setup_click_handler() {
let window = web_sys::window().unwrap();
let document = window.document().unwrap();
if let Some(button) = document.get_element_by_id("btn") {
let closure = Closure::wrap(Box::new(|_: web_sys::Event| {
web_sys::console::log_1(&"Button clicked!".into());
}) as Box<dyn FnMut(web_sys::Event)>);
button
.add_event_listener_with_callback("click", closure.as_ref().unchecked_ref())
.unwrap();
closure.forget(); // leak the closure so JS can still call it later
}
}
// Return JavaScript objects
#[wasm_bindgen]
pub fn create_js_object() -> wasm_bindgen::JsValue {
use js_sys::Object;
let obj = Object::new();
js_sys::Reflect::set(&obj, &"name".into(), &"Rust".into()).unwrap();
js_sys::Reflect::set(&obj, &"version".into(), &1.into()).unwrap();
obj.into()
}
// Promise support (async)
use wasm_bindgen_futures::JsFuture;
#[wasm_bindgen]
pub async fn fetch_data(url: &str) -> Result<String, JsValue> {
let window = web_sys::window().ok_or_else(|| JsValue::from_str("no window"))?;
let resp_value = JsFuture::from(window.fetch_with_str(url))
.await?;
let resp: web_sys::Response = resp_value.dyn_into()?;
let text = JsFuture::from(resp.text()?).await?;
Ok(text.as_string().unwrap_or_default())
}
wasm-wasi & System Calls
// wasm32-wasi target enables system-like behavior
// rustup target add wasm32-wasi
// .cargo/config.toml (target selection lives here, not in Cargo.toml)
[build]
target = "wasm32-wasi"
[dependencies]
# Can use std now!
// src/main.rs — can use std::io, std::fs, etc.
use std::fs;
use std::io::{self, Read};
fn main() {
// WASI provides file I/O
let data = fs::read_to_string("input.txt").unwrap();
println!("Read: {}", data);
}
// Build:
// cargo build --target wasm32-wasi --release
// Run with wasmtime:
// wasmtime target/wasm32-wasi/release/my_app.wasm
// Available WASI capabilities:
// - File I/O (limited to allowed directories)
// - Environment variables (std::env)
// - Command-line arguments (std::env::args)
// - Time (std::time)
// - Process exit codes
// Example: file processing
fn process_file(path: &str) -> io::Result<String> {
let mut file = fs::File::open(path)?;
let mut contents = String::new();
file.read_to_string(&mut contents)?;
Ok(contents.to_uppercase())
}
// WASI runtime security (sandboxed execution)
// Files accessible only within granted directories
// Network access disabled by default
// System calls restricted
Performance Considerations
// WASM performance tips
// 1. Minimize memory allocations
#[wasm_bindgen]
pub fn process_good(data: &[u8]) -> Vec<u8> {
// Reuse input buffer, single output allocation
data.iter().map(|&b| b * 2).collect()
}
#[wasm_bindgen]
pub fn process_bad(data: &[u8]) -> Vec<u8> {
// Multiple allocations: intermediate vecs
let doubled: Vec<u8> = data.iter().map(|&b| b * 2).collect();
let filtered: Vec<u8> = doubled.into_iter().filter(|&b| b > 10).collect();
filtered
}
// 2. Batch JavaScript interop
#[wasm_bindgen]
pub fn batch_calculation(inputs: &[i32]) -> Vec<i32> {
// Process all at once, return results
// Avoid calling back to JS in a loop
inputs.iter().map(|x| x * 2).collect()
}
// DON'T do this:
// #[wasm_bindgen]
// pub fn slow_calculation(inputs: &[i32]) -> Vec<i32> {
// let mut results = Vec::new();
// for x in inputs {
// let processed = web_sys::expensive_js_call(x);
// results.push(processed);
// }
// results
// }
// 3. Use typed arrays
#[wasm_bindgen]
pub fn fill_buffer(buf: &mut [u8], value: u8) {
buf.fill(value);
}
// JavaScript:
// const buf = new Uint8Array(1000);
// fillBuffer(buf, 42);
// 4. Avoid string conversion overhead
#[wasm_bindgen]
pub fn process_bytes(data: &[u8]) -> Vec<u8> {
// Work with bytes directly
data.iter().map(|&b| b.wrapping_add(1)).collect()
}
// 5. Shrink and optimize with wasm-opt (Binaryen's optimizer)
// cargo install wasm-opt
// wasm-opt -Oz output.wasm -o output.wasm
// Binary size optimization
// Cargo.toml:
// [profile.release]
// opt-level = "z" # Optimize for size
// lto = true
// codegen-units = 1
Component Model (Future)
// WebAssembly Component Model — cross-language composition
// (Experimental, future direction)
// Define interfaces
// interface.wit
package example:hello;
interface hello {
greet: func(name: string) -> string;
}
world my-world {
export hello;
}
// Compile a core module into a component:
// wasm-tools component new output.wasm -o component.wasm
// Components allow:
// - Language-neutral interfaces
// - Cross-language WASM composition
// - System-like capabilities
// - Better tooling and type safety
// This is the future of WASM standardization
34. Networking & Async I/O
tokio Runtime Deep Dive
use tokio::runtime::Runtime;
use std::time::Duration;
// Create runtime
fn main() {
let rt = Runtime::new().unwrap();
rt.block_on(async {
println!("Running async code");
});
}
// Builder for custom configuration
fn custom_runtime() {
let rt = tokio::runtime::Builder::new_multi_thread()
.worker_threads(4)
.thread_name("my-worker")
.stack_size(2 * 1024 * 1024) // 2MB stack
.enable_all() // Enable all features
.build()
.unwrap();
rt.block_on(async {
// Async work
});
}
// Single-threaded runtime (for simple apps)
fn single_thread() {
let rt = tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
.unwrap();
rt.block_on(async {
// All tasks run on single thread
});
}
// Work stealing scheduler
// - Multiple threads (worker pool)
// - Each thread has a task queue
// - Idle threads steal from busy threads
// - Excellent cache locality
// - Fair task distribution
// Task spawning
async fn main_async() {
// Spawn task on runtime
let handle = tokio::spawn(async {
println!("Task running");
42
});
// Wait for result
let result = handle.await.unwrap();
println!("Result: {}", result);
}
// Blocking code in async context
async fn io_bound() {
// For actual I/O (network, disk), use async APIs
// For CPU-bound work, use tokio::task::block_in_place (multi-thread
// runtime only) or tokio::task::spawn_blocking
let result = tokio::task::block_in_place(|| {
// Expensive computation or blocking call
(0..1_000_000).sum::<i32>()
});
println!("Result: {}", result);
}
// Keeping runtime alive
struct AppState {
runtime: Runtime,
}
impl AppState {
fn new() -> Self {
AppState {
runtime: Runtime::new().unwrap(),
}
}
fn run<F>(&mut self, future: F) -> F::Output
where
F: std::future::Future,
{
self.runtime.block_on(future)
}
}
TCP/UDP with tokio
use tokio::net::{TcpListener, TcpStream, UdpSocket};
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use std::net::SocketAddr;
// TCP Server
async fn tcp_server() -> tokio::io::Result<()> {
let listener = TcpListener::bind("127.0.0.1:8080").await?;
loop {
let (socket, peer_addr) = listener.accept().await?;
println!("Accepted: {}", peer_addr);
tokio::spawn(async move {
if let Err(e) = handle_connection(socket).await {
eprintln!("Error: {}", e);
}
});
}
}
async fn handle_connection(mut socket: TcpStream) -> tokio::io::Result<()> {
let mut buf = [0; 1024];
loop {
let n = socket.read(&mut buf).await?;
if n == 0 {
return Ok(()); // Connection closed
}
let data = &buf[..n];
println!("Received: {:?}", String::from_utf8_lossy(data));
socket.write_all(b"ACK").await?;
}
}
// TCP Client
async fn tcp_client() -> tokio::io::Result<()> {
let mut stream = TcpStream::connect("127.0.0.1:8080").await?;
stream.write_all(b"Hello, server!").await?;
let mut buf = [0; 1024];
let n = stream.read(&mut buf).await?;
println!("Response: {}", String::from_utf8_lossy(&buf[..n]));
Ok(())
}
// UDP Socket
async fn udp_server() -> tokio::io::Result<()> {
let socket = UdpSocket::bind("127.0.0.1:9000").await?;
let mut buf = [0; 1024];
loop {
let (n, peer) = socket.recv_from(&mut buf).await?;
println!("UDP from {}: {:?}", peer, String::from_utf8_lossy(&buf[..n]));
socket.send_to(b"ACK", peer).await?;
}
}
// Multiple concurrent connections
async fn handle_many_connections() {
let listener = TcpListener::bind("127.0.0.1:8080")
.await
.unwrap();
let mut interval = tokio::time::interval(std::time::Duration::from_secs(1));
loop {
tokio::select! {
result = listener.accept() => {
if let Ok((socket, addr)) = result {
println!("New: {}", addr);
tokio::spawn(async move {
let _ = handle_connection(socket).await;
});
}
}
_ = interval.tick() => {
println!("Tick");
}
}
}
}
Tower Middleware
use tower::{Service, ServiceExt, service_fn};
use std::task::{Context, Poll};
use std::pin::Pin;
use std::future::Future;
// Custom middleware as a layer
struct LoggingService<S> {
inner: S,
}
impl<S> Service<String> for LoggingService<S>
where
S: Service<String> + Send + 'static,
S::Future: Send + 'static,
{
type Response = S::Response;
type Error = S::Error;
type Future = Pin<Box<dyn Future<Output = Result<S::Response, S::Error>> + Send>>;
fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
self.inner.poll_ready(cx)
}
fn call(&mut self, req: String) -> Self::Future {
println!("Handling: {}", req);
let future = self.inner.call(req);
Box::pin(async move {
let response = future.await?;
println!("Done");
Ok(response)
})
}
}
// Layer to wrap services
use tower::Layer;
struct LoggingLayer;
impl<S> Layer<S> for LoggingLayer {
type Service = LoggingService<S>;
fn layer(&self, inner: S) -> Self::Service {
LoggingService { inner }
}
}
// Composing middleware around a leaf service
async fn middleware_example() {
let mut service = LoggingLayer.layer(service_fn(|req: String| async move {
Ok::<_, std::convert::Infallible>(format!("echo: {req}"))
}));
let response = service.ready().await.unwrap().call("hello".to_string()).await;
println!("{:?}", response);
}
Hyper & Axum Web Frameworks
// Hyper — low-level HTTP (hyper 0.14 API shown; hyper 1.0 moved the
// Server and Body helpers into hyper-util/http-body-util)
use hyper::{Server, Body, Request, Response, StatusCode};
use hyper::service::{service_fn, make_service_fn};
async fn handle(_req: Request<Body>) -> Result<Response<Body>, hyper::Error> {
Ok(Response::new(Body::from("Hello, World!")))
}
async fn hyper_server() {
let make_svc = make_service_fn(|_conn| async {
Ok::<_, hyper::Error>(service_fn(handle))
});
let addr = ([127, 0, 0, 1], 3000).into();
let server = Server::bind(&addr).serve(make_svc);
println!("Listening on http://{}", addr);
server.await.unwrap();
}
// Axum — high-level framework
use axum::{
routing::{get, post},
Router,
Json,
extract::Path,
};
use serde_json::json;
// Define routes
async fn root() -> &'static str {
"Hello!"
}
async fn user(Path(id): Path<u32>) -> Json<serde_json::Value> {
Json(json!({
"id": id,
"name": "User"
}))
}
async fn create_user(Json(payload): Json<serde_json::Value>) -> Json<serde_json::Value> {
Json(payload)
}
async fn axum_server() {
let app = Router::new()
.route("/", get(root))
.route("/user/:id", get(user))
.route("/user", post(create_user));
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.unwrap();
axum::serve(listener, app).await.unwrap();
}
// Extractors (parse request data)
use axum::extract::Query;
async fn search(
Query(params): Query<std::collections::HashMap<String, String>>,
) -> String {
format!("Query: {:?}", params)
}
// Middleware in Axum
use tower::ServiceBuilder;
async fn with_middleware() {
let app = Router::new()
.route("/", get(root))
.layer(
ServiceBuilder::new()
.layer(tower_http::trace::TraceLayer::new_for_http())
);
}
Connection Pooling
use tokio::net::TcpStream;
use std::sync::Arc;
use tokio::sync::Semaphore;
use std::collections::VecDeque;
// Simple connection pool
struct ConnectionPool {
available: Arc<Semaphore>,
connections: Arc<tokio::sync::Mutex<VecDeque<TcpStream>>>,
max_size: usize,
}
impl ConnectionPool {
fn new(max_size: usize) -> Self {
ConnectionPool {
available: Arc::new(Semaphore::new(max_size)),
connections: Arc::new(tokio::sync::Mutex::new(VecDeque::new())),
max_size,
}
}
async fn get(&self, addr: &str) -> tokio::io::Result<PooledConnection> {
// forget() keeps the permit checked out; Drop::add_permits restores it
self.available.acquire().await.unwrap().forget();
let mut conns = self.connections.lock().await;
let stream = if let Some(conn) = conns.pop_front() {
conn
} else {
TcpStream::connect(addr).await?
};
Ok(PooledConnection {
stream: Some(stream),
pool: self.available.clone(),
})
}
}
// RAII guard: restores the permit on drop (a full pool would also
// return the stream to the queue)
struct PooledConnection {
stream: Option<TcpStream>,
pool: Arc<Semaphore>,
}
impl Drop for PooledConnection {
fn drop(&mut self) {
self.pool.add_permits(1);
}
}
// Using the deadpool crate (recommended)
// use deadpool::managed::Pool;
async fn with_deadpool() {
// let pool = Pool::builder(...)
// .max_size(10)
// .build()
// .unwrap();
//
// let conn = pool.get().await.unwrap();
// // Use conn
// // Returned to pool on drop
}
// sqlx with connection pool (database)
async fn db_pool() {
let pool = sqlx::postgres::PgPoolOptions::new()
.max_connections(5)
.connect("postgres://user:pass@localhost/db")
.await
.unwrap();
// Pool manages connections automatically
let _row = sqlx::query_as::<_, (i32,)>("SELECT 1")
.fetch_one(&pool)
.await
.unwrap();
}
Graceful Shutdown
use tokio::sync::broadcast;
use tokio::signal;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
async fn graceful_shutdown() {
let (tx, mut rx) = broadcast::channel(1);
// Spawn server task
let server_handle = tokio::spawn(async move {
let listener = tokio::net::TcpListener::bind("127.0.0.1:8080")
.await
.unwrap();
loop {
tokio::select! {
result = listener.accept() => {
if let Ok((socket, _)) = result {
let mut rx = rx.resubscribe(); // Receiver has resubscribe(), not subscribe()
tokio::spawn(async move {
handle_client(socket, &mut rx).await;
});
}
}
_ = rx.recv() => {
println!("Shutting down server");
break;
}
}
}
});
// Wait for signal
signal::ctrl_c().await.unwrap();
println!("Shutdown signal received");
// Broadcast shutdown
let _ = tx.send(());
// Wait for server to finish
server_handle.await.unwrap();
}
async fn handle_client(
mut socket: tokio::net::TcpStream,
shutdown: &mut broadcast::Receiver<()>,
) {
let mut buf = [0; 1024];
loop {
tokio::select! {
result = socket.read(&mut buf) => {
match result {
Ok(0) => break, // Connection closed
Ok(n) => {
let _ = socket.write_all(&buf[..n]).await;
}
Err(_) => break,
}
}
_ = shutdown.recv() => {
println!("Client shutdown");
break;
}
}
}
}
// With axum
use axum::Router;
async fn axum_graceful_shutdown() {
let app = Router::new();
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.unwrap();
let (tx, rx) = tokio::sync::oneshot::channel();
// Spawn with graceful shutdown
tokio::spawn(async move {
axum::serve(listener, app)
.with_graceful_shutdown(async {
let _ = rx.await;
})
.await
.unwrap();
});
// Later: signal shutdown
let _ = tx.send(());
}
35. Advanced Type System Patterns
Sealed Traits
// Sealed trait — prevent external implementations
pub trait MyTrait: sealed::Sealed {
fn do_something(&self);
}
mod sealed {
pub trait Sealed {}
}
// Only types in this module can implement MyTrait
struct MyImpl;
impl sealed::Sealed for MyImpl {}
impl MyTrait for MyImpl {
fn do_something(&self) {
println!("Something");
}
}
// Users cannot do: impl MyTrait for TheirType
// Because TheirType cannot implement the private sealed::Sealed trait
// More sophisticated sealing
pub trait ApiFunction: sealed::ApiSealed {
fn call(&self);
}
mod sealed {
pub trait ApiSealed {}
impl ApiSealed for super::Public1 {}
impl ApiSealed for super::Public2 {}
// Only Public1 and Public2 can implement ApiFunction
}
pub struct Public1;
pub struct Public2;
pub struct Public3; // Cannot implement ApiFunction
impl ApiFunction for Public1 {
fn call(&self) {}
}
impl ApiFunction for Public2 {
fn call(&self) {}
}
Extension Traits
// Add methods to existing types without modifying them
trait IteratorExt: Iterator {
fn my_custom_operation(self) -> Vec<Self::Item>;
}
impl<I: Iterator> IteratorExt for I {
fn my_custom_operation(self) -> Vec<Self::Item> {
self.collect()
}
}
fn use_extension() {
let result = [1, 2, 3].iter()
.map(|&x| x * 2)
.my_custom_operation();
println!("{:?}", result);
}
// Real-world example: extending Result
trait ResultExt<T, E> {
fn context(self, msg: &str) -> Result<T, String>;
}
impl<T, E: std::fmt::Display> ResultExt<T, E> for Result<T, E> {
fn context(self, msg: &str) -> Result<T, String> {
self.map_err(|e| format!("{}: {}", msg, e))
}
}
fn file_operation() -> Result<String, String> {
std::fs::read_to_string("file.txt")
.context("Failed to read file")
}
// Extending String with custom methods
trait StringExt {
fn reverse_words(&self) -> String;
}
impl StringExt for String {
fn reverse_words(&self) -> String {
self.split_whitespace()
.rev()
.collect::<Vec<_>>()
.join(" ")
}
}
fn reverse_string() {
let s = "hello world".to_string();
println!("{}", s.reverse_words()); // "world hello"
}
// Extension traits with generic methods
trait SliceExt<T> {
fn split_at_predicate<F>(&self, predicate: F) -> Vec<&[T]>
where
F: Fn(&T) -> bool;
}
impl<T> SliceExt<T> for [T] {
fn split_at_predicate<F>(&self, predicate: F) -> Vec<&[T]>
where
F: Fn(&T) -> bool,
{
let mut result = Vec::new();
let mut start = 0;
for (i, item) in self.iter().enumerate() {
if predicate(item) {
if start < i {
result.push(&self[start..i]);
}
start = i + 1;
}
}
if start < self.len() {
result.push(&self[start..]);
}
result
}
}
Tower of Trait Bounds
// Incrementally restrictive trait bounds
trait Level0 {
fn basic(&self);
}
trait Level1: Level0 {
fn more(&self);
}
trait Level2: Level1 {
fn advanced(&self);
}
trait Level3: Level2 {
fn extreme(&self);
}
// Implementing Level2 requires impls of Level0 and Level1 as well
struct MyType;
impl Level0 for MyType {
fn basic(&self) {}
}
impl Level1 for MyType {
fn more(&self) {}
}
impl Level2 for MyType {
fn advanced(&self) {}
}
// Real-world: std trait hierarchy
// PartialEq → Eq
// PartialOrd → Ord
// Copy → Clone
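The PartialOrd → Ord split shows up in practice with floats: `f64` is only `PartialOrd` because `NaN` breaks totality, so `slice::sort` (which requires `Ord`) is unavailable and you must supply a total order yourself. A small demonstration:

```rust
fn main() {
    // i32 implements Ord — a total order exists, sort() just works
    let mut ints = vec![3, 1, 2];
    ints.sort();
    assert_eq!(ints, [1, 2, 3]);

    let mut floats = vec![3.0_f64, f64::NAN, 1.0];
    // floats.sort(); // ERROR: f64 is not Ord (NaN != NaN)
    // total_cmp imposes IEEE 754 totalOrder, sorting NaN after numbers
    floats.sort_by(|a, b| a.total_cmp(b));
    assert_eq!(floats[0], 1.0);
    assert_eq!(floats[1], 3.0);
    assert!(floats[2].is_nan());
    println!("ok");
}
```

The trait hierarchy thus encodes a real mathematical distinction: `Ord` promises totality, and types that cannot keep that promise stop at `PartialOrd`.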
// Generic with escalating requirements
fn level0<T: Level0>(t: &T) {
t.basic();
}
fn level1<T: Level1>(t: &T) {
t.basic();
t.more();
}
fn level2<T: Level2>(t: &T) {
t.basic();
t.more();
t.advanced();
}
// Practical: iteration abstractions
use std::iter::Iterator;
fn simple_iter<I: Iterator>(iter: I) {
// Iterator only
}
fn exact_iter<I: ExactSizeIterator>(iter: I) {
// ExactSizeIterator: Iterator (supertrait), adds a known length
// Can call len()
}
fn double_iter<I: DoubleEndedIterator>(iter: I) {
// Can iterate from both ends
}
// Combining multiple hierarchies
fn advanced_process<T>(t: &T)
where
T: Level2 + Clone + Send + Sync,
{
t.basic();
t.clone();
// Can be sent across threads
}
Existential Types
// Type-erased return values
fn returns_iterator(kind: &str) -> Box<dyn Iterator<Item = i32>> {
match kind {
"range" => Box::new(0..10),
"vec" => Box::new(vec![1, 2, 3].into_iter()),
_ => Box::new(std::iter::empty()),
}
}
// Different concrete types, same trait object
fn use_iterator() {
let iter = returns_iterator("range");
for item in iter {
println!("{}", item);
}
}
// Existential with associated types
trait Producer {
type Output;
fn produce(&self) -> Self::Output;
}
struct IntProducer;
impl Producer for IntProducer {
type Output = i32;
fn produce(&self) -> i32 { 42 }
}
struct StringProducer;
impl Producer for StringProducer {
type Output = String;
fn produce(&self) -> String { "hello".to_string() }
}
// Can't directly erase associated types in trait objects
// Must use workarounds:
trait ErasedProducer {
fn produce_box(&self) -> Box<dyn std::any::Any>;
}
impl ErasedProducer for IntProducer {
fn produce_box(&self) -> Box<dyn std::any::Any> {
Box::new(self.produce())
}
}
impl ErasedProducer for StringProducer {
fn produce_box(&self) -> Box<dyn std::any::Any> {
Box::new(self.produce())
}
}
// Or use impl Trait (existential without dyn)
fn returns_impl_trait(_kind: &str) -> impl Iterator<Item = i32> {
0..10
// Return type determined at compile time
// But hidden from callers
}
Type-Erased Containers
use std::any::{Any, TypeId};
use std::collections::HashMap;
// Heterogeneous container
struct TypeMap {
inner: HashMap<TypeId, Box<dyn Any>>,
}
impl TypeMap {
fn new() -> Self {
TypeMap {
inner: HashMap::new(),
}
}
fn insert<T: 'static>(&mut self, value: T) {
self.inner.insert(TypeId::of::<T>(), Box::new(value));
}
fn get<T: 'static>(&self) -> Option<&T> {
self.inner
.get(&TypeId::of::<T>())
.and_then(|v| v.downcast_ref::<T>())
}
fn get_mut<T: 'static>(&mut self) -> Option<&mut T> {
self.inner
.get_mut(&TypeId::of::<T>())
.and_then(|v| v.downcast_mut::<T>())
}
}
fn use_type_map() {
let mut map = TypeMap::new();
map.insert(42i32);
map.insert("hello".to_string());
map.insert(3.14f64);
assert_eq!(map.get::<i32>(), Some(&42));
assert_eq!(map.get::<String>(), Some(&"hello".to_string()));
assert_eq!(map.get::<f64>(), Some(&3.14));
assert_eq!(map.get::<bool>(), None);
}
// With trait dispatch
struct TraitMap {
inner: HashMap<String, Box<dyn TraitMethod>>,
}
trait TraitMethod {
fn call(&self);
}
impl TraitMethod for i32 {
fn call(&self) {
println!("i32: {}", self);
}
}
impl TraitMethod for String {
fn call(&self) {
println!("String: {}", self);
}
}
fn use_trait_map() {
let mut map = TraitMap {
inner: HashMap::new(),
};
map.inner.insert("int".to_string(), Box::new(42i32));
map.inner.insert("str".to_string(), Box::new("hello".to_string()));
for (_, trait_obj) in &map.inner {
trait_obj.call();
}
}
Marker Types & Unit Structs
// Marker types encode information in the type system
struct Red;
struct Blue;
struct Green;
struct Pixel<Color> {
r: u8,
g: u8,
b: u8,
_color: std::marker::PhantomData<Color>,
}
// Type-safe color handling
impl Pixel<Red> {
fn to_red_channel(&self) -> u8 {
self.r
}
}
impl Pixel<Green> {
fn to_green_channel(&self) -> u8 {
self.g
}
}
// Typestate pattern with markers
struct Locked;
struct Unlocked;
struct Lock<State> {
data: String,
_state: std::marker::PhantomData<State>,
}
impl Lock<Locked> {
fn new(data: String) -> Self {
Lock {
data,
_state: std::marker::PhantomData,
}
}
fn unlock(self) -> Lock<Unlocked> {
Lock {
data: self.data,
_state: std::marker::PhantomData,
}
}
}
impl Lock<Unlocked> {
fn lock(self) -> Lock<Locked> {
Lock {
data: self.data,
_state: std::marker::PhantomData,
}
}
fn read(&self) -> &str {
&self.data
}
}
fn typestate_example() {
let locked = Lock::new("secret".to_string());
// locked.read(); // ERROR: can't read when locked
let unlocked = locked.unlock();
println!("{}", unlocked.read()); // OK
let locked_again = unlocked.lock();
// locked_again.read(); // ERROR again
}
// Compile-time state machines
trait State {}
struct Init;
struct Running;
struct Done;
impl State for Init {}
impl State for Running {}
impl State for Done {}
struct Machine<S: State> {
state: std::marker::PhantomData<S>,
}
impl Machine<Init> {
fn start(self) -> Machine<Running> {
Machine {
state: std::marker::PhantomData,
}
}
}
impl Machine<Running> {
fn finish(self) -> Machine<Done> {
Machine {
state: std::marker::PhantomData,
}
}
}
Compile-Time State Machines (Typestate)
use std::marker::PhantomData;
// Builder pattern with compile-time guarantees
struct Builder<T1 = NoField, T2 = NoField, T3 = NoField> {
field1: Option<String>,
field2: Option<i32>,
field3: Option<bool>,
_marker: PhantomData<(T1, T2, T3)>,
}
struct NoField;
struct HasField;
impl Default for Builder {
fn default() -> Self {
Builder {
field1: None,
field2: None,
field3: None,
_marker: PhantomData,
}
}
}
impl Builder<NoField, NoField, NoField> {
fn with_field1(mut self, f: String) -> Builder<HasField, NoField, NoField> {
self.field1 = Some(f);
Builder {
field1: self.field1,
field2: self.field2,
field3: self.field3,
_marker: PhantomData,
}
}
}
impl Builder<HasField, NoField, NoField> {
fn with_field2(mut self, f: i32) -> Builder<HasField, HasField, NoField> {
self.field2 = Some(f);
Builder {
field1: self.field1,
field2: self.field2,
field3: self.field3,
_marker: PhantomData,
}
}
}
impl Builder<HasField, HasField, NoField> {
fn with_field3(mut self, f: bool) -> Builder<HasField, HasField, HasField> {
self.field3 = Some(f);
Builder {
field1: self.field1,
field2: self.field2,
field3: self.field3,
_marker: PhantomData,
}
}
}
#[derive(Debug)]
struct MyStruct {
field1: String,
field2: i32,
field3: bool,
}
impl Builder<HasField, HasField, HasField> {
fn build(self) -> MyStruct {
MyStruct {
field1: self.field1.unwrap(),
field2: self.field2.unwrap(),
field3: self.field3.unwrap(),
}
}
}
fn typestate_builder() {
let obj = Builder::default()
.with_field1("hello".to_string())
.with_field2(42)
.with_field3(true)
.build();
println!("{:?}", obj);
// This would NOT compile:
// let incomplete = Builder::default()
// .with_field1("hello".to_string())
// .build(); // ERROR: missing field2 and field3
}
36. Serde & Serialization Intermediate
How Serde Works
Serde (from serialize/deserialize) is Rust's de facto serialization framework. It decouples data formats (JSON, TOML, YAML, MessagePack, CBOR) from serialization logic via two traits: Serialize and Deserialize. The framework defines an intermediate "data model" abstraction, so you write (de)serialization code once and it works with any format.
// Serialize trait — your type says "here's how to serialize me"
// Serialize trait — your type says "here's how to serialize me"
// (the real traits declare required methods; there is no default body)
pub trait Serialize {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer;
}
// Deserialize trait — your type says "here's how to deserialize me"
pub trait Deserialize<'de>: Sized {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>;
}
Derive Macros & Common Attributes
use serde::{Serialize, Deserialize};
// Basic derive — generates Serialize/Deserialize implementations
#[derive(Serialize, Deserialize, Debug)]
struct User {
pub id: u32,
pub name: String,
pub email: String,
}
// #[serde(rename)] — serialize field with different name
#[derive(Serialize, Deserialize)]
struct Person {
pub id: u32,
#[serde(rename = "full_name")]
pub name: String,
}
// #[serde(skip)] — don't serialize/deserialize this field
#[derive(Serialize, Deserialize)]
struct User {
pub name: String,
#[serde(skip)]
internal_state: u32,
}
// #[serde(default)] — use Default::default() if field missing
#[derive(Serialize, Deserialize)]
struct Config {
pub port: u16,
#[serde(default)]
pub timeout: u64,
}
// #[serde(flatten)] — inline fields from nested struct
#[derive(Serialize, Deserialize)]
struct Extended {
pub id: u32,
#[serde(flatten)]
pub extra: Extra,
}
// #[serde(rename_all)] — rename all fields (camelCase, snake_case, etc.)
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct ApiRequest {
pub user_id: u32, // becomes "userId" in JSON
pub request_type: String, // becomes "requestType"
}
// #[serde(deny_unknown_fields)] — error on extra fields during deserialization
#[derive(Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
struct StrictConfig {
pub name: String,
}
Enum Serialization Strategies
| Strategy | Name | JSON Example | Use Case |
|---|---|---|---|
| Externally Tagged (default) | (no attribute needed) | {"Variant": {...}} | Clear variant name in JSON |
| Internally Tagged | #[serde(tag = "type")] | {"type": "Variant", ...fields...} | Flatter JSON structure |
| Adjacently Tagged | #[serde(tag = "t", content = "c")] | {"t": "Variant", "c": {...}} | Balance between clarity and size |
| Untagged | #[serde(untagged)] | Variant data only | Infer from structure |
#[derive(Serialize, Deserialize)]
enum Status {
#[serde(rename = "active")]
Active,
#[serde(rename = "inactive")]
Inactive,
}
Serde Data Formats
| Format | Crate | Compact | Human-Readable | Best For |
|---|---|---|---|---|
| JSON | serde_json | Good | Yes | Web APIs, config |
| TOML | toml | Fair | Yes | Config files |
| YAML | serde_yaml (unmaintained; forks exist) | Fair | Yes | Human config, Kubernetes |
| MessagePack | rmp-serde | Excellent | No | Binary RPC, storage |
| CBOR | serde_cbor | Excellent | No | Binary protocols |
| Bincode | bincode | Excellent | No | Fast binary serialization |
Custom Serializer & Deserializer
use serde::{Serializer, Deserializer};
// Custom serialization for DateTime — serialize as ISO 8601 string
mod datetime_format {
use serde::{self, Serializer, Deserializer};
use chrono::{DateTime, Utc};
pub fn serialize<S>(date: &DateTime<Utc>, s: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
let date_str = date.to_rfc3339();
s.serialize_str(&date_str)
}
pub fn deserialize<'de, D>(d: D) -> Result<DateTime<Utc>, D::Error>
where
D: Deserializer<'de>,
{
let date_str = String::deserialize(d)?;
DateTime::parse_from_rfc3339(&date_str)
.map(|dt| dt.with_timezone(&Utc))
.map_err(serde::de::Error::custom)
}
}
#[derive(Serialize, Deserialize)]
struct Event {
#[serde(with = "datetime_format")]
pub timestamp: DateTime<Utc>,
}
Zero-Copy Deserialization
Use borrowed data types to avoid allocations during deserialization:
use std::borrow::Cow;
// Instead of String, use &'de str to borrow from the input
#[derive(Deserialize)]
struct User<'de> {
pub name: &'de str, // borrowed — no allocation!
pub email: &'de str,
}
// Or use Cow for conditional ownership
#[derive(Deserialize)]
struct FlexibleUser<'de> {
pub name: Cow<'de, str>, // Borrowed or Owned
}
serde_with for Complex Transformations
use serde_with::{serde_as, DisplayFromStr};
// Serialize number as string and back
#[serde_as]
#[derive(Serialize, Deserialize)]
struct Data {
#[serde_as(as = "DisplayFromStr")]
pub id: u64, // Serialized as "12345" in JSON
}
Performance Tips
- Avoid allocations: Use borrowed data (&'de str, &'de [u8]) when deserializing
- Pre-allocate: Use Vec::with_capacity() or String::with_capacity() for known sizes
- Choose format wisely: Binary formats (MessagePack, bincode, CBOR) are faster than JSON
- Streaming: For large files, use streaming deserializers instead of loading everything into memory
- Custom impl: For hot paths, implement Serialize/Deserialize manually for fine-grained control
37. Collections Deep Dive Intermediate
Vec<T> Internals
Vec<T> is a growable array on the heap. Internally it is three words: ptr (to the heap buffer), len (elements in use), and cap (allocated capacity). When a push would exceed cap, Vec reallocates with roughly doubling growth, making push amortized O(1).
let mut v = Vec::new(); // len=0, cap=0 (nothing allocated yet)
v.push(1); // first push allocates; cap jumps to 4 for i32 in current std
v.push(2); // len=2, cap=4 — no realloc
v.push(3); // len=3, cap=4 — no realloc until cap is exceeded
// Pre-allocate if you know size
let mut v = Vec::with_capacity(100);
for i in 0..100 {
v.push(i); // No reallocs!
}
HashMap Internals (SwissTable)
HashMap<K, V> uses the SwissTable design (via the hashbrown crate) with SIMD-accelerated probing for cache-efficient lookups; the earlier Robin Hood implementation was replaced in Rust 1.36. When the load factor exceeds its threshold (7/8 in hashbrown), the table grows and rehashes.
use std::collections::HashMap;
let mut map = HashMap::new();
map.insert("key", "value");
// Entry API — avoids double lookup
map.entry("key")
.or_insert_with(|| "default");
BTreeMap vs HashMap
| Property | HashMap | BTreeMap |
|---|---|---|
| Ordering | Unordered (hash-based) | Sorted by key |
| Insert/Lookup | O(1) average, O(n) worst | O(log n) guaranteed |
| Memory | More spread (poor cache) | Cache-friendly (B-tree nodes) |
| Range queries | Not efficient | Very efficient (range()) |
| Use when | Fast unordered lookup, many insertions | Sorted iteration, range queries, few hot lookups |
HashSet, BTreeSet, VecDeque, BinaryHeap
use std::collections::{HashSet, BTreeSet, VecDeque, BinaryHeap};
// HashSet — unordered, O(1) insert/lookup
let mut set = HashSet::new();
set.insert(1);
// BTreeSet — sorted, O(log n) insert/lookup
let mut bset = BTreeSet::new();
bset.insert(3);
bset.insert(1);
for x in &bset {
println!("{}", x); // 1, 3 (sorted)
}
// VecDeque — double-ended queue, FIFO/LIFO, O(1) push_front/back
let mut deq = VecDeque::new();
deq.push_back(1);
deq.push_front(0);
deq.pop_back();
// BinaryHeap — max-heap (or min via custom Ord)
let mut heap = BinaryHeap::new();
heap.push(3);
heap.push(1);
heap.push(2);
assert_eq!(heap.pop(), Some(3)); // Max first
LinkedList — Rarely Use This
LinkedList<T> is a doubly-linked list. Avoid it — it has poor cache locality, high memory overhead (two pointers per node), and slow iteration. Use Vec or VecDeque instead.
Entry API for HashMap
use std::collections::HashMap;
let mut map = HashMap::new();
// or_insert — avoid double lookup!
map.entry("key").or_insert(0);
// or_insert_with — lazy evaluation
map.entry("key")
.or_insert_with(|| expensive_computation());
// and_modify — update existing, insert if missing
map.entry("count")
.and_modify(|c| *c += 1)
.or_insert(1);
Custom Hash & Eq
use std::hash::{Hash, Hasher};
use std::collections::hash_map::DefaultHasher;
#[derive(Clone)]
struct CustomKey {
value: i32,
}
impl Hash for CustomKey {
fn hash<H: Hasher>(&self, state: &mut H) {
self.value.hash(state);
}
}
impl PartialEq for CustomKey {
fn eq(&self, other: &Self) -> bool {
self.value == other.value
}
}
impl Eq for CustomKey {}
Stack-Allocated Collections
| Crate | Type | Feature |
|---|---|---|
| smallvec | SmallVec<[T; N]> | Vec with N inline items; grows to heap |
| arrayvec | ArrayVec<T, CAP> | Fixed-capacity vec; no heap growth |
| tinyvec | TinyVec<[T; N]> | Like SmallVec, no unsafe code |
Complexity Summary
| Collection | Insert | Lookup | Remove | Iterate | Best For |
|---|---|---|---|---|---|
| Vec | O(1) amortized (end) | O(1) (index), O(n) (linear) | O(n) | O(n) | General purpose array |
| HashMap | O(1) avg | O(1) avg | O(1) avg | O(n) | Fast unordered lookup |
| BTreeMap | O(log n) | O(log n) | O(log n) | O(n) sorted | Ordered, range queries |
| VecDeque | O(1) (front/back) | O(1) (index) | O(1) (front/back) | O(n) | Queue/deque operations |
| BinaryHeap | O(log n) | O(1) (peek max) | O(log n) | O(n) unordered | Priority queue |
| HashSet | O(1) avg | O(1) avg | O(1) avg | O(n) unordered | Membership testing |
| BTreeSet | O(log n) | O(log n) | O(log n) | O(n) sorted | Ordered unique items |
Drain, Retain, Split Patterns
// Drain — consume elements while iterating
let mut v = vec![1, 2, 3, 4];
for x in v.drain(1..3) {
println!("{}", x); // 2, 3
}
// v is now [1, 4]
// Retain — keep elements matching predicate
let mut v = vec![1, 2, 3, 4];
v.retain(|x| x % 2 == 0);
// v is now [2, 4]
// Split off — move tail into new vec
let mut v = vec![1, 2, 3, 4];
let tail = v.split_off(2);
// v = [1, 2], tail = [3, 4]
38. String Types Comprehensive Intermediate
The String Zoo
| Type | Owned | UTF-8 | Length | Use Case |
|---|---|---|---|---|
| String | Yes | Yes | Mutable | Owned Rust string, mutable |
| &str | No (borrowed) | Yes | Fixed slice | String references, params |
| &'static str | No | Yes | Fixed | String literals, no lifetime issues |
| OsStr | No | No (platform-specific) | Slice | OS paths, command args |
| OsString | Yes | No | Mutable | Owned OS strings |
| CStr | No | No (null-terminated) | Slice | C interop, FFI |
| CString | Yes | No | Variable | Owned C strings |
| Path | No | No | Slice | Filesystem paths |
| PathBuf | Yes | No | Variable | Owned filesystem paths |
| Cow<str> | Conditional | Yes | Variable | Zero-copy or owned |
| Bytes | Yes (Arc) | No | Fixed | Cheap byte clones |
| BytesMut | Yes | No | Mutable | Efficient byte mutations |
Memory Layout
// String: [ptr (8) | len (8) | cap (8)] = 24 bytes on 64-bit
// Heap: [u8, u8, u8, ..., u8]
let s = String::from("hello");
println!("{}", s.len()); // 5
println!("{}", s.capacity()); // likely 5+
// &str: [ptr (8) | len (8)] = 16 bytes on 64-bit
// Does NOT own the data
let slice: &str = &s;
When to Use Each Type — Decision Tree
fn param_guidelines() {
// Function parameters: ALWAYS use &str (not &String!)
fn process(s: &str) {} // Good — accepts String, &str, &'static str
// Need to mutate? Use &mut String
fn append(s: &mut String) { s.push_str("!"); }
// Need ownership? Use String
fn owns(s: String) {}
// Filesystem path? Use &Path in params, PathBuf for ownership
fn read_file(p: &Path) {}
// Zero-copy when possible? Use &'de str in serde
// Or Cow<str> for conditional ownership
}
String Conversion Matrix
| From | To String | To &str | To OsStr | To Path |
|---|---|---|---|---|
| String | — | s.as_str() or &s | s.as_os_str() | s.as_ref() |
| &str | s.to_string() or to_owned() | — | OsStr::new(s) | Path::new(s) |
| OsString | s.into_string().ok() | s.to_str() | s.as_os_str() | Path::new(&s) |
| PathBuf | p.to_str().map(|s| s.to_string()) | p.to_str() | p.as_os_str() | p.as_path() |
UTF-8, char vs Byte Indexing
// UTF-8 guarantees: valid sequences only
let s = "🦀"; // 4 bytes, 1 char
println!("{}", s.len()); // 4 (bytes)
println!("{}", s.chars().count()); // 1 (chars)
// CANNOT index by char — only by bytes!
// let c = s[0]; // ERROR — crab is 4 bytes
// Correct: iterate by chars or use grapheme crate
for ch in s.chars() {
println!("{}", ch); // 🦀
}
// Byte slicing: must be on char boundaries
let hello = "hello";
let slice = &hello[0..3]; // "hel" (safe)
String Building Patterns
use std::fmt::Write;
// format! macro — convenient but slower
let s = format!("{} {}", "hello", "world");
// write! macro — efficient
let mut s = String::new();
write!(&mut s, "{} {}", "hello", "world").unwrap();
// push_str — no allocation for static strs
let mut s = String::new();
s.push_str("hello");
s.push(' ');
s.push_str("world");
// join — efficient concatenation
let v = vec!["a", "b", "c"];
let s = v.join("-"); // "a-b-c"
// concat! — compile-time concatenation
const GREETING: &str = concat!("hello", " ", "world");
Cow<str> for Zero-Copy Conditional Ownership
use std::borrow::Cow;
fn transform(s: &str) -> Cow<str> {
if s.contains("_") {
// Need to modify — allocate
Cow::Owned(s.replace("_", " "))
} else {
// No modification needed — borrow
Cow::Borrowed(s)
}
}
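To confirm the two branches behave as described, the function can be exercised like this (redefined here so the snippet stands alone):

```rust
use std::borrow::Cow;

fn transform(s: &str) -> Cow<'_, str> {
    if s.contains('_') {
        Cow::Owned(s.replace('_', " ")) // must modify — allocate
    } else {
        Cow::Borrowed(s) // untouched — zero-copy
    }
}

fn main() {
    // No underscores: the input is borrowed straight through, no allocation.
    assert!(matches!(transform("hello"), Cow::Borrowed(_)));
    // Underscores: a new String is allocated with the replacement applied.
    let fixed = transform("hello_world");
    assert!(matches!(fixed, Cow::Owned(_)));
    assert_eq!(fixed, "hello world");
}
```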
Performance Tips
- Use &str in signatures, not &String — enables deref coercion, zero-copy
- Pre-allocate when the size is known: String::with_capacity(100)
- Avoid repeated to_string(): use references or Cow
- Use write! over format! in hot paths
- Reuse strings with clear() instead of creating new ones: s.clear(); s.push_str(...)
- For many small strings: consider SmallString or other stack-allocated types
Common Pitfalls
| Pitfall | What Happens | Fix |
|---|---|---|
| Index into string: s[0] | Doesn't compile — str has no integer indexing | Use s.chars().next() or a byte-range slice |
| Slice at non-boundary: &s[0..2] for 🦀 | Panic — not a char boundary | Ensure slice boundaries align with char boundaries |
| Passing &String to &str param | Works via deref, but ugly style | Just pass &s — deref coercion handles it |
| Comparing String and &str directly | Works, but inefficient if called repeatedly | Use s.as_str() == other or just s == other |
39. Implementing Standard Traits Intermediate
Display & Debug
use std::fmt;
// Display — user-friendly representation
impl fmt::Display for Point {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "({}, {})", self.x, self.y)
}
}
// Debug — can derive
#[derive(Debug)]
struct Point { x: i32, y: i32 }
// Or implement manually
impl fmt::Debug for Point {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
f.debug_struct("Point")
.field("x", &self.x)
.field("y", &self.y)
.finish()
}
}
From & Into, TryFrom & TryInto
use std::convert::TryFrom;
// From — infallible conversion (gives Into for free!)
// Orphan rule: the trait or the type must be local to your crate, so we
// convert into a local newtype rather than into a foreign type like String
struct Meters(f64);
impl From<f64> for Meters {
fn from(n: f64) -> Self {
Meters(n)
}
}
let m = Meters::from(1.5);
let m: Meters = 1.5.into(); // Into from From!
// TryFrom — fallible conversion
struct Port(u16);
impl TryFrom<&str> for Port {
type Error = std::num::ParseIntError;
fn try_from(s: &str) -> Result<Self, Self::Error> {
s.parse().map(Port)
}
}
AsRef & AsMut
// AsRef — borrow as another type (zero-copy)
fn check_url<T: AsRef<str>>(url: T) -> bool {
url.as_ref().starts_with("https")
}
check_url("https://example.com"); // &str
check_url(String::from("https://...")); // String
// AsMut — mutable borrow
fn reset<T: AsMut<[u8]>>(buf: T) {
let slice = buf.as_mut();
for byte in slice {
*byte = 0;
}
}
Deref & DerefMut
// Smart pointer pattern — auto-deref
use std::ops::{Deref, DerefMut};
struct MyBox<T>(T);
impl<T> Deref for MyBox<T> {
type Target = T;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl<T> DerefMut for MyBox<T> {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}
Index & IndexMut
use std::ops::{Index, IndexMut};
impl Index<usize> for MyVec {
type Output = i32;
fn index(&self, idx: usize) -> &i32 {
&self.data[idx]
}
}
impl IndexMut<usize> for MyVec {
fn index_mut(&mut self, idx: usize) -> &mut i32 {
&mut self.data[idx]
}
}
Iterator & IntoIterator
// Iterator — step through items
impl Iterator for Counter {
type Item = u32;
fn next(&mut self) -> Option<Self::Item> {
if self.count < self.max {
self.count += 1;
Some(self.count)
} else {
None
}
}
fn size_hint(&self) -> (usize, Option<usize>) {
let remaining = (self.max - self.count) as usize; // cast: fields are u32
(remaining, Some(remaining))
}
}
// IntoIterator — convert type into iterator (enables for loops!)
impl IntoIterator for MyCollection {
type Item = i32;
type IntoIter = MyIterator;
fn into_iter(self) -> Self::IntoIter {
MyIterator { inner: self.data }
}
}
let col = MyCollection { ..Default::default() };
for item in col { // Calls IntoIterator::into_iter
println!("{}", item);
}
FromIterator (enabling .collect())
// Allows: iterator.collect::<Vec<_>>()
impl FromIterator<i32> for MyVec {
fn from_iter<T: IntoIterator<Item = i32>>(iter: T) -> Self {
let mut v = MyVec::new();
for item in iter {
v.push(item);
}
v
}
}
Clone & Copy
// Copy — bitwise memcpy (automatic, must be small)
#[derive(Copy, Clone)]
struct Point { x: i32, y: i32 }
let p1 = Point { x: 0, y: 0 };
let p2 = p1; // Copy — no move, p1 still valid
// Clone — deep copy (explicit)
let s1 = String::from("hello");
let s2 = s1.clone(); // Explicit clone
PartialEq, Eq, PartialOrd, Ord
// PartialEq — == and !=, can derive
#[derive(PartialEq)]
struct User { id: u32 }
// Eq — marks equality as a full equivalence relation (reflexive, symmetric, transitive)
// f64 is only PartialEq, not Eq, because NaN != NaN
#[derive(PartialEq, Eq)]
struct User { id: u32 }
// PartialOrd — ordering with None for incomparable values
// Ord — total ordering
#[derive(PartialOrd, Ord, PartialEq, Eq)]
struct User { id: u32 }
Hash
use std::hash::{Hash, Hasher};
use std::collections::hash_map::DefaultHasher;
#[derive(Hash)]
struct User { id: u32, name: String }
// Manual impl for custom hashing
// Invariant: if a == b then hash(a) == hash(b) — so if Hash uses only id,
// PartialEq must compare only id as well
impl Hash for User {
fn hash<H: Hasher>(&self, state: &mut H) {
self.id.hash(state); // Only hash id, ignore name
}
}
Default
// Derive or implement for sensible default values
#[derive(Default)]
struct Config {
timeout: u32, // 0
name: String, // empty String
debug: bool, // false
}
// Manual implementation
impl Default for Config {
fn default() -> Self {
Config { timeout: 30, name: String::new(), debug: false }
}
}
Drop
// Called when value is dropped (cleanup)
impl Drop for Connection {
fn drop(&mut self) {
println!("Closing connection");
// cleanup
}
}
Operator Overloading: Add, Sub, Mul
use std::ops::Add;
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vector { x: f64, y: f64 }
impl Add for Vector {
type Output = Vector;
fn add(self, other: Vector) -> Vector {
Vector {
x: self.x + other.x,
y: self.y + other.y,
}
}
}
FromStr (Parsing from Strings)
use std::str::FromStr;
impl FromStr for Point {
type Err = String;
fn from_str(s: &str) -> Result<Self, String> {
let parts: Vec<_> = s.split(',').collect();
if parts.len() != 2 {
return Err("invalid format".to_string());
}
// parse() returns ParseIntError, which has no From conversion to String,
// so map the error explicitly instead of using bare ?
let x = parts[0].parse().map_err(|e| format!("bad x: {e}"))?;
let y = parts[1].parse().map_err(|e| format!("bad y: {e}"))?;
Ok(Point { x, y })
}
}
let p: Point = "1,2".parse()?; // parse() delegates to FromStr; caller must return Result
Error Trait (Custom Error Types)
use std::error::Error;
use std::fmt;
#[derive(Debug)]
enum MyError {
IoError(std::io::Error),
ParseError(String),
}
impl fmt::Display for MyError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
MyError::IoError(e) => write!(f, "IO error: {}", e),
MyError::ParseError(e) => write!(f, "Parse error: {}", e),
}
}
}
impl Error for MyError {}
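In practice a custom error type pairs the Error impl with From impls so the ? operator converts underlying errors automatically; a minimal sketch (parse_positive is a made-up example function):

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
enum MyError {
    Parse(std::num::ParseIntError),
    Empty,
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            MyError::Parse(e) => write!(f, "parse error: {}", e),
            MyError::Empty => write!(f, "empty input"),
        }
    }
}

impl Error for MyError {}

// From lets `?` convert the underlying error automatically.
impl From<std::num::ParseIntError> for MyError {
    fn from(e: std::num::ParseIntError) -> Self {
        MyError::Parse(e)
    }
}

fn parse_positive(s: &str) -> Result<u32, MyError> {
    if s.is_empty() {
        return Err(MyError::Empty);
    }
    let n: u32 = s.parse()?; // ParseIntError -> MyError via From
    Ok(n)
}

fn main() {
    assert_eq!(parse_positive("42").unwrap(), 42);
    assert!(parse_positive("abc").is_err());
    assert!(matches!(parse_positive(""), Err(MyError::Empty)));
}
```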
Trait Implementation Guide
| Trait | Derive? | Purpose | Implement When |
|---|---|---|---|
| Display | No | User-friendly output | Custom human-readable format needed |
| Debug | Yes (usually) | Developer output | Default derive works; manual for custom |
| Clone | Yes | Deep copy | Type owns heap data |
| Copy | Yes | Bitwise copy (auto) | Only small stack-only types |
| Default | Yes (if possible) | Default values | Sensible default exists |
| PartialEq/Eq | Yes | Equality comparison | Type supports == |
| Ord/PartialOrd | Yes | Ordering | Type supports total ordering |
| Hash | Yes | For HashMap/HashSet | Type used as key |
| Iterator | No | Iteration protocol | Custom container type |
| IntoIterator | No | for loop support | Enable for loop over your type |
| From/Into | No | Type conversion | Natural conversion between types |
| Deref/DerefMut | No | Smart pointer auto-deref | Wrapping types (Box, Rc, etc.) |
| Drop | No | Cleanup on drop | Resource management needed |
40. Concurrency Primitives Deep Dive Advanced
Mutex<T> Internals & Poisoning
use std::sync::Mutex;
// lock() — blocks until available, returns MutexGuard
let m = Mutex::new(0);
let mut guard = m.lock().unwrap();
*guard = 42;
// guard dropped here — lock released
// try_lock() — non-blocking, returns Result (Err immediately if the lock is held)
if let Ok(mut guard) = m.try_lock() {
*guard = 100;
}
// Poisoning — if a thread panics while holding the lock, the mutex is poisoned
// Subsequent lock() calls return Err; recover with into_inner() if the data is still usable
match m.lock() {
Ok(guard) => { /* use guard */ },
Err(poisoned) => {
let guard = poisoned.into_inner(); // recover
}
}
RwLock<T> — Reader-Writer Lock
Better than Mutex when reads >> writes. Multiple readers can hold lock simultaneously; writers get exclusive access.
use std::sync::RwLock;
let data = RwLock::new(vec![1, 2, 3]);
// Multiple readers — don't block each other
let r1 = data.read().unwrap();
let r2 = data.read().unwrap();
println!("{:?}", *r1);
drop(r1);
drop(r2); // read guards must be released, or write() below deadlocks
// Writer — exclusive access
let mut w = data.write().unwrap();
w.push(4);
Condvar (Condition Variables)
Wait/notify pattern for thread synchronization. More efficient than spin-loops.
use std::sync::{Mutex, Condvar};
use std::sync::Arc;
let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair_clone = pair.clone();
// Producer
std::thread::spawn(move || {
let (lock, cvar) = &*pair_clone;
let mut ready = lock.lock().unwrap();
*ready = true;
cvar.notify_one(); // Wake one waiter
});
// Consumer
let (lock, cvar) = &*pair;
let mut ready = lock.lock().unwrap();
// Wait until ready is true
while !*ready {
ready = cvar.wait(ready).unwrap();
}
Barrier — Synchronization Point
use std::sync::Barrier;
use std::sync::Arc;
let barrier = Arc::new(Barrier::new(3)); // Wait for 3 threads
for i in 0..3 {
let b = barrier.clone();
std::thread::spawn(move || {
println!("Thread {} waiting", i);
b.wait(); // Blocks until all reach here
println!("Thread {} proceed", i);
});
}
Once & OnceLock
use std::sync::Once;
use std::sync::OnceLock;
// Once — call closure only once (static initialization)
static INIT: Once = Once::new();
INIT.call_once(|| {
println!("Initialize only once");
});
// OnceLock — generic once-write cell
static CONFIG: OnceLock<String> = OnceLock::new();
CONFIG.get_or_init(|| load_config());
Atomic Types & Memory Ordering
Lock-free synchronization for simple types. Memory ordering controls CPU instruction optimization:
| Ordering | CPU Guarantees | Use Case | Performance |
|---|---|---|---|
| Relaxed | None — no synchronization | Statistics counters (approximate) | Fastest |
| Acquire | Acquire semantics — synchronizes-with Release | Acquire lock | Fast |
| Release | Release semantics — synchronizes-with Acquire | Release lock | Fast |
| AcqRel | Both Acquire and Release | Read-modify-write ops | Moderate |
| SeqCst | Full sequential consistency — slowest | Need total ordering (rare) | Slowest |
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
// Simple counter — Relaxed OK (stats don't need exact value)
let counter = AtomicUsize::new(0);
counter.fetch_add(1, Ordering::Relaxed);
// Flag for signaling — Release/Acquire needed
let flag = AtomicBool::new(false);
flag.store(true, Ordering::Release);
if flag.load(Ordering::Acquire) {
// Sees the store above
}
Compare-and-Swap (CAS) Loops
// Retry until successful (optimistic locking)
let val = AtomicUsize::new(0);
let mut current = val.load(Ordering::Relaxed);
loop {
let new = current + 1;
match val.compare_exchange(current, new, Ordering::Relaxed, Ordering::Relaxed) {
Ok(_) => break,
Err(actual) => current = actual, // Retry
}
}
Crossbeam — Lock-Free & Scoped Threads
use crossbeam::thread;
// Scoped threads — safe to borrow from outer scope
// (std::thread::scope offers the same capability since Rust 1.63)
let data = vec![1, 2, 3];
thread::scope(|s| {
s.spawn(|_| {
println!("{:?}", data); // Borrows data safely
});
}).unwrap();
// Lock-free queue
use crossbeam::queue::SegQueue;
let q = SegQueue::new();
q.push(1);
q.pop();
Parking Lot — Faster Mutex/RwLock
Historically faster than std: no poisoning, tiny locks (a parking_lot Mutex is one byte), and better behavior under high contention. Note that std's Mutex improved substantially in Rust 1.62, so benchmark before switching.
use parking_lot::Mutex;
let m = Mutex::new(0);
let mut guard = m.lock(); // Never panics — no poisoning
*guard = 42;
Thread-Local Storage
use std::cell::RefCell;
thread_local! {
static BUFFER: RefCell<Vec<u8>> = RefCell::new(Vec::with_capacity(1024));
}
let len = BUFFER.with(|b| b.borrow().len());
Deadlock Prevention
- Lock ordering: Always acquire locks in same order globally
- Timeouts: Use try_lock with timeout instead of blocking indefinitely
- One lock: Prefer single Mutex over multiple locks when possible
- Avoid nested locks: Never hold lock while acquiring another
- Tools: cargo-deadlock, loom for testing, parking_lot for better behavior
41. File I/O & Filesystem Intermediate
Read, Write, BufRead, Seek Traits
use std::io::{Read, Write, BufRead, Seek, SeekFrom};
// Read — read bytes into buffer
let mut buffer = [0; 512];
let n = file.read(&mut buffer)?; // Returns bytes read
// Write — write bytes from buffer
file.write_all("hello".as_bytes())?;
// BufRead — line-by-line reading
use std::io::BufReader;
let reader = BufReader::new(file);
for line in reader.lines() {
let line = line?;
println!("{}", line);
}
// Seek — move file pointer
file.seek(SeekFrom::Start(10))?; // Offset 10 from start
file.seek(SeekFrom::End(-10))?; // 10 bytes from end
File::open, File::create, OpenOptions
use std::fs::File;
use std::fs::OpenOptions;
// Read-only
let f = File::open("file.txt")?;
// Write-only (truncate)
let f = File::create("file.txt")?;
// OpenOptions builder — fine-grained control
let f = OpenOptions::new()
.read(true)
.write(true)
.create(true)
.append(false)
.open("file.txt")?;
BufReader & BufWriter for Performance
use std::io::{BufReader, BufWriter};
// BufReader — buffers reads (reduces syscalls)
let file = File::open("large.txt")?;
let reader = BufReader::with_capacity(65536, file);
// BufWriter — buffers writes
let file = File::create("out.txt")?;
let mut writer = BufWriter::with_capacity(65536, file);
writer.write_all(b"data")?;
writer.flush()?; // Force write to disk
Reading Methods
use std::fs;
use std::io::Read;
// read_to_string — entire file into String
let contents = fs::read_to_string("file.txt")?;
// read_to_end — entire file into Vec<u8>
let contents = fs::read("file.txt")?;
// lines() — iterate line by line
use std::io::BufRead;
let file = File::open("file.txt")?;
let reader = BufReader::new(file);
for line in reader.lines() {
let line = line?;
// process line
}
// bytes() — iterate byte by byte
for byte in reader.bytes() {
let b = byte?;
}
Writing Methods
// write! macro — formatted write
use std::io::Write;
write!(&mut file, "x = {}\n", 42)?;
// writeln! macro
writeln!(&mut file, "hello")?;
// write_all — write entire buffer
file.write_all(b"data")?;
// fs::write — entire file at once
fs::write("file.txt", b"contents")?;
Path & PathBuf
use std::path::{Path, PathBuf};
// Path — immutable, slice-like
fn read_config(path: &Path) -> io::Result<String> {
fs::read_to_string(path)
}
// PathBuf — mutable, owned
let mut path = PathBuf::from("/home/user");
path.push("config"); // Now "/home/user/config"
path.set_extension("toml"); // ".../config.toml"
// components() — iterate path components
for comp in path.components() {
println!("{:?}", comp);
}
// File operations
let ext = path.extension(); // Some("toml")
let name = path.file_name(); // Some("config.toml")
std::fs Functions
| Function | Purpose |
|---|---|
fs::read_to_string(path) | Read entire file as String |
fs::read(path) | Read entire file as Vec<u8> |
fs::write(path, data) | Write entire file |
fs::copy(src, dst) | Copy file |
fs::rename(old, new) | Rename/move file |
fs::remove_file(path) | Delete file |
fs::create_dir(path) | Create single directory |
fs::create_dir_all(path) | Create directory and parents |
fs::remove_dir(path) | Remove empty directory |
fs::read_dir(path) | List directory entries |
fs::metadata(path) | Get file metadata (size, perms, etc.) |
Directory Walking
// Using walkdir crate (recommended)
use walkdir::WalkDir;
for entry in WalkDir::new(".")
.into_iter()
.filter_map(|e| e.ok()) {
let path = entry.path();
if path.extension() == Some(std::ffi::OsStr::new("rs")) {
println!("{}", path.display());
}
}
Async File I/O (tokio)
use tokio::fs;
#[tokio::main]
async fn main() -> std::io::Result<()> {
// Async read — doesn't block the runtime
let contents = fs::read_to_string("file.txt").await?;
// Async write
fs::write("out.txt", "data").await?;
Ok(())
}
Common Patterns
- Config file reading: Use fs::read_to_string + serde_json/toml parsing
- CSV processing: Use csv crate with BufReader
- Log rotation: Open new file daily, delete old files
- Error handling: Always use ? to propagate io::Error
- Large files: Use BufReader/BufWriter + BufRead::lines()
- Binary files: Use Read trait directly with fixed-size buffers
42. Database & ORM Patterns Intermediate
sqlx — Compile-Time Checked Queries
Type-safe SQL with optional compile-time verification (via the query! macros). Supports PostgreSQL, MySQL, and SQLite.
use sqlx::postgres::PgPool;
use sqlx::{FromRow, query, query_as};
// Define model with FromRow derive
#[derive(FromRow)]
struct User {
id: i32,
name: String,
email: String,
}
#[tokio::main]
async fn main() -> Result<(), sqlx::Error> {
// Connection pool
let pool = PgPool::connect("postgres://user:pass@localhost/db").await?;
// query_as::<_, User> — checked at runtime; the query_as! macro variant
// verifies the SQL against a live database at compile time
let user = query_as::<_, User>(
"SELECT id, name, email FROM users WHERE id = $1"
)
.bind(1)
.fetch_one(&pool)
.await?;
// fetch_all returns a Vec for multiple rows
let users = query_as::<_, User>(
"SELECT * FROM users WHERE age > $1"
)
.bind(18)
.fetch_all(&pool)
.await?;
Ok(())
}
Connection Pooling (sqlx::Pool)
use sqlx::postgres::PgPoolOptions;
// Configure pool size and connection limits
let pool = PgPoolOptions::new()
.max_connections(5)
.connect("postgres://...")
.await?;
// Use pool — gets connection from pool, returns on drop
let row = sqlx::query("SELECT 1")
.fetch_one(&pool)
.await?;
Diesel — Type-Safe Query Builder
// Define schema (from database introspection)
diesel::table! {
users {
id -> Integer,
name -> Text,
email -> Text,
}
}
// Define model
#[derive(Insertable)]
#[diesel(table_name = users)] // diesel 2.x syntax; 1.x used #[table_name = "users"]
struct NewUser {
name: String,
email: String,
}
// Type-safe queries
use diesel::prelude::*;
let users = users::table
.filter(users::email.like("%@example.com"))
.load::<User>(&mut connection)?; // diesel 2.x takes connections by &mut
// Insert
let new_user = NewUser { name: "Alice".into(), email: "alice@example.com".into() };
diesel::insert_into(users::table)
.values(&new_user)
.execute(&mut connection)?;
SeaORM — Async ActiveRecord
use sea_orm::prelude::*;
// Entity definition (codegen from DB)
#[derive(Clone, Debug, PartialEq, DeriveModel, DeriveActiveModel)]
pub struct Model {
pub id: i32,
pub name: String,
}
// Query (async)
let user = User::find_by_id(1)
.one(db)
.await?;
// Insert
let user = ActiveUser {
name: Set("Bob".to_string()),
..Default::default()
}
.save(db)
.await?;
// Delete
user.delete(db).await?;
Comparison: sqlx vs Diesel vs SeaORM
| Feature | sqlx | Diesel | SeaORM |
|---|---|---|---|
| Async | Yes | No (sync) | Yes |
| Type-safe queries | Compile-time checked | Query builder | Query builder |
| Migrations | Built-in (sqlx-cli) | Built-in | Built-in |
| Query style | Raw SQL strings | Builder DSL | Builder DSL |
| Best for | Fast, direct SQL control | Type safety first | Modern async apps |
| Learning curve | Low | High | Medium |
Transactions
// sqlx transactions
let mut tx = pool.begin().await?;
sqlx::query("INSERT INTO accounts (balance) VALUES ($1)")
.bind(100)
.execute(&mut *tx)
.await?;
sqlx::query("UPDATE accounts SET balance = balance - 50 WHERE id = $1")
.bind(1)
.execute(&mut *tx)
.await?;
tx.commit().await?; // All or nothing
Migrations
// With sqlx CLI:
// sqlx migrate add -r create_users
// sqlx migrate run
// SQL migrations stored in migrations/ directory:
// migrations/20231201000001_create_users.up.sql
// migrations/20231201000001_create_users.down.sql
// Run migrations in code
use sqlx::migrate::Migrator;
let migrator = Migrator::new(std::path::Path::new("./migrations")).await?;
migrator.run(&pool).await?;
Redis
use redis::Commands;
let client = redis::Client::open("redis://127.0.0.1/")?;
let mut con = client.get_connection()?;
// SET / GET — redis needs the return type spelled out
let _: () = con.set("key", "value")?;
let val: String = con.get("key")?;
// Lists, sets, hashes
let _: () = con.rpush("list", vec![1, 2, 3])?;
let items: Vec<i32> = con.lrange("list", 0, -1)?;
Repository Pattern in Rust
// async fn in traits is stable since Rust 1.75 (use the async-trait crate for dyn-compatible traits)
trait UserRepository {
async fn get_by_id(&self, id: i32) -> Result<User>;
async fn save(&self, user: &User) -> Result<()>;
async fn delete(&self, id: i32) -> Result<()>;
}
// SQL implementation
use sqlx::{query_as, PgPool};
struct SqlUserRepository {
pool: PgPool,
}
impl UserRepository for SqlUserRepository {
async fn get_by_id(&self, id: i32) -> Result<User> {
query_as::<_, User>("SELECT * FROM users WHERE id = $1")
.bind(id)
.fetch_one(&self.pool)
.await
.map_err(|e| e.into())
}
// ... other methods
}
Type-Safe IDs
// Prevent mixing up IDs (UserId != PostId)
#[derive(Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct UserId(i32);
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct PostId(i32);
// Now compile error: post_repo.get(user_id) // UserId != PostId
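A minimal sketch of making these newtype IDs ergonomic (the `load_user` helper is hypothetical): a `Display` impl for logging, and a function signature that rejects the wrong ID type at compile time.

```rust
use std::fmt;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct UserId(i32);
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct PostId(i32);

impl fmt::Display for UserId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "user-{}", self.0) // stable textual form for logs
    }
}

// Only a UserId is accepted — passing a PostId is a type error.
fn load_user(id: UserId) -> String {
    format!("loading {id}")
}

fn main() {
    let uid = UserId(7);
    println!("{}", load_user(uid));
    // load_user(PostId(7)); // ❌ expected `UserId`, found `PostId`
}
```

The runtime representation is still a bare `i32` (zero cost); only the compile-time type distinguishes the two.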
43. CLI & Argument Parsing Intermediate
Clap Derive API (Modern, Recommended)
use clap::{Parser, Subcommand};
#[derive(Parser)]
#[command(name = "MyApp")]
#[command(about = "A great CLI tool", long_about = None)]
struct Args {
/// Input file
#[arg(short, long)]
input: String,
/// Output file
#[arg(short, long, default_value = "output.txt")]
output: String,
/// Verbose mode
#[arg(short, long)]
verbose: bool,
/// Number of threads
#[arg(short, long, default_value_t = 4)]
threads: u32,
/// Read from environment variable
#[arg(long, env)]
api_key: Option<String>,
#[command(subcommand)]
command: Option<Commands>,
}
#[derive(Subcommand)]
enum Commands {
/// Process files
Process {
#[arg(value_name = "FILE")]
file: String,
},
/// Show status
Status,
}
fn main() {
let args = Args::parse();
match args.command {
Some(Commands::Process { file }) => {
println!("Processing {}", file);
},
Some(Commands::Status) => {
println!("Status OK");
},
None => {},
}
}
Clap Attributes Reference
| Attribute | Purpose | Example |
|---|---|---|
| #[arg(short)] | Short flag (-v) | #[arg(short)] verbose: bool |
| #[arg(long)] | Long flag (--verbose) | #[arg(long)] verbose: bool |
| #[arg(short, long)] | Both short and long | #[arg(short, long)] |
| #[arg(default_value)] | Default string value | #[arg(default_value = "out.txt")] |
| #[arg(default_value_t)] | Default typed value | #[arg(default_value_t = 4)] |
| #[arg(env)] | Read from env var | #[arg(env)] api_key: String |
| #[arg(value_name)] | Metavar in help | #[arg(value_name = "FILE")] |
| #[arg(required)] | Required argument | #[arg(required = true)] |
| #[arg(num_args)] | Variadic args | #[arg(num_args = 1..)] |
Value Validation & Custom Parsers
// Built-in validators
#[derive(Parser)]
struct Args {
/// Port number (0-65535)
#[arg(short, long, value_parser = clap::value_parser!(u16))]
port: u16,
/// Only certain values allowed
#[arg(short, long, value_parser = ["json", "yaml", "toml"])]
format: String,
}
// Custom parser function
fn parse_level(s: &str) -> Result<Level, String> {
match s {
"info" => Ok(Level::Info),
"warn" => Ok(Level::Warn),
"error" => Ok(Level::Error),
_ => Err(format!("Invalid level: {}", s)),
}
}
#[derive(Parser)]
struct Args {
#[arg(short, long, value_parser = parse_level)]
log_level: Level,
}
Shell Completions
// Generate completions at build time or runtime
use clap::CommandFactory; // provides Args::command()
use clap_complete::{generate, shells};
fn main() {
if std::env::var("GENERATE_COMPLETIONS").is_ok() {
let mut cmd = Args::command();
generate(shells::Bash, &mut cmd, "myapp", &mut std::io::stdout());
}
}
Color Output
use colored::Colorize;
println!("{}", "Success!".green());
println!("{}", "Warning".yellow());
println!("{}", "Error".red());
// Or with owo-colors (more lightweight)
use owo_colors::OwoColorize;
println!("{}", "Styled".cyan().bold());
Progress Bars with indicatif
use indicatif::ProgressBar;
let pb = ProgressBar::new(100);
for _ in 0..100 {
pb.inc(1);
std::thread::sleep(std::time::Duration::from_millis(100));
}
pb.finish_with_message("Done!");
// Spinner
let spinner = ProgressBar::new_spinner();
spinner.set_message("Loading...");
Interactive Prompts with dialoguer
use dialoguer::{Input, Confirm, Select};
// Text input
let name: String = Input::new()
.with_prompt("Your name")
.interact_text()?;
// Yes/no confirmation
let confirmed = Confirm::new()
.with_prompt("Continue?")
.interact()?;
// Multiple choice
let selection = Select::new()
.with_prompt("Choose option")
.items(&["Option 1", "Option 2"])
.interact()?;
Configuration: Merge CLI + Config File + Env Vars
use figment::{Figment, providers::{Serialized, Env, Format, Toml}}; // Format provides Toml::file
use serde::{Serialize, Deserialize};
#[derive(Serialize, Deserialize, Default)]
struct Config {
port: u16,
database_url: String,
}
let config: Config = Figment::new()
.merge(Serialized::defaults(Config::default()))
.merge(Toml::file("Config.toml"))
.merge(Env::prefixed("APP_"))
.extract()?;
Exit Codes & Error Reporting
// Exit with code
std::process::exit(1); // Error
std::process::exit(0); // Success
// Pretty error output with anyhow
use anyhow::{Result, Context};
fn main() {
if let Err(e) = run() {
eprintln!("Error: {:?}", e);
std::process::exit(1);
}
}
fn run() -> Result<()> {
let file = std::fs::read_to_string("config.toml")
.context("Failed to read config file")?;
Ok(())
}
Complete CLI App Template
use clap::Parser;
use anyhow::Result;
#[derive(Parser)]
#[command(name = "mytool")]
struct Cli {
#[arg(short, long)]
verbose: bool,
#[arg(long, default_value = "config.toml")]
config: String,
}
fn main() {
let cli = Cli::parse();
if let Err(e) = run(&cli) {
eprintln!("Error: {:?}", e);
std::process::exit(1);
}
}
fn run(cli: &Cli) -> Result<()> {
let config = std::fs::read_to_string(&cli.config)?;
// Your logic here
Ok(())
}
44. Logging & Tracing Intermediate
log Crate Facade
Lightweight abstraction for logging. Application chooses backend (env_logger, slog, tracing, etc.).
use log::{trace, debug, info, warn, error};
trace!("Very detailed");
debug!("Debugging info");
info!("General information");
warn!("Warning message");
error!("Error occurred");
// With structured data
info!("User login: {} from {}", user_id, ip);
env_logger — Simple Backend
fn main() {
env_logger::init();
info!("Starting application");
}
// Control with RUST_LOG env var:
// RUST_LOG=debug cargo run
// RUST_LOG=myapp::db=debug cargo run (filter by module)
tracing Crate — Spans & Events
Structured, async-aware instrumentation. Enables distributed tracing and context propagation.
use tracing::{info, debug, span, Level, instrument};
// Create span (context for related events)
let span = span!(Level::INFO, "request", request_id = "abc123");
let _guard = span.enter();
info!("Processing request"); // Associated with span
// Events with fields
info!(status = 200, "Request completed");
#[instrument] Attribute
Automatic span creation for functions.
use tracing::instrument;
#[instrument]
fn fetch_user(user_id: u32) -> Result<User> {
debug!("Fetching user from DB");
// DB query...
Ok(user)
}
// Generates span with function name and user_id field automatically
Subscribers: tracing-subscriber
use tracing::Level;
use tracing_subscriber::{fmt, filter::EnvFilter};
// Pretty human-readable formatting
tracing_subscriber::fmt()
.with_max_level(Level::DEBUG)
.init();
// JSON output (structured logging)
tracing_subscriber::fmt()
.json()
.init();
// With filtering via env var
tracing_subscriber::fmt()
.with_env_filter(EnvFilter::from_default_env())
.init();
// Layer composition — multiple outputs
tracing_subscriber::fmt()
.with_writer(std::io::stderr) // Write to stderr
.with_ansi(true) // Use colors
.init();
Structured Logging with Fields
use tracing::{info, Span};
// Fields in events
info!(
user_id = 42,
email = "user@example.com",
status = "active",
"User login successful"
);
// Fields in spans (level and span name come first)
let span = span!(
Level::INFO,
"http_request",
request_id = %uuid::Uuid::new_v4(), // % records via Display
http_method = "GET"
);
Span Hierarchy & Context Propagation
// Parent span — all nested operations inherit context
let request_span = span!(Level::INFO, "http_request", request_id = "req-123");
let _request_guard = request_span.enter();
let db_span = span!(Level::INFO, "db_query");
let _db_guard = db_span.enter();
info!("Query executed"); // Has both request_id and db_query context
// With async/await — Instrument trait
use tracing::Instrument;
async fn async_task() {
info!("Task running");
}
async_task()
.instrument(span!(Level::INFO, "my_task"))
.await;
tracing + tokio Integration
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::util::SubscriberInitExt;
#[tokio::main]
async fn main() {
tracing_subscriber::registry()
.with(tracing_subscriber::fmt::layer())
.init();
spawn_tasks().await;
}
async fn spawn_tasks() {
let task_span = span!(Level::INFO, "background_task");
tokio::spawn(
async {
info!("Task running in background");
}
.instrument(task_span)
);
}
Log Levels & Filtering
| Level | When to Use | Example |
|---|---|---|
| TRACE | Very detailed diagnostics (rare) | Every function entry/exit |
| DEBUG | Development debugging | Database queries, internal state |
| INFO | General information | Server started, user login |
| WARN | Something unexpected but recoverable | Slow query, deprecated API use |
| ERROR | Errors that need attention | Failed DB connection, parsing error |
// RUST_LOG environment variable
// RUST_LOG=info # All INFO and above
// RUST_LOG=debug # All DEBUG and above
// RUST_LOG=myapp::db=debug # Only myapp::db at DEBUG
// RUST_LOG=debug,myapp::db=trace # DEBUG for all, TRACE for myapp::db
OpenTelemetry Integration
use tracing_opentelemetry::OpenTelemetryLayer;
use opentelemetry::global;
let tracer = opentelemetry_jaeger::new_pipeline()
.install_simple()
.expect("Failed to install OpenTelemetry");
tracing_subscriber::registry()
.with(OpenTelemetryLayer::new(tracer))
.init();
// Now spans are sent to Jaeger for distributed tracing
Production Logging Patterns
- JSON output: Use .json() for structured log collection (ELK, Splunk, Datadog)
- Request IDs: Generate a UUID for each request, pass through all operations for correlation
- Context propagation: Use spans to automatically include context in all child operations
- Log rotation: Use systemd journal or external tools (logrotate) to manage file size
- Performance: Use sampling for high-volume services (trace 1% of requests)
- Secrets: Never log passwords, keys, or sensitive PII. Use field redaction if needed
- Async logging: Use non-blocking loggers to prevent I/O from blocking application
- Error chains: Log full error context with source chains for debugging
log vs tracing — When to Use Each
| Feature | log | tracing |
|---|---|---|
| Simple logging | Yes (lightweight) | Overkill |
| Structured data | Limited | Excellent |
| Context propagation | Manual | Automatic (spans) |
| Async support | Limited (no task-aware context) | Native |
| Distributed tracing | Not designed for it | Built-in (OpenTelemetry) |
| Performance critical | Better (minimal overhead) | Good but higher |
| Use for | Simple apps, libraries | Complex async systems, microservices |
45. Memory Ordering & Atomics Advanced
Why Memory Ordering Matters
CPUs and compilers reorder instructions for performance. In single-threaded code this is invisible, but in multi-threaded code, one thread may observe another thread's writes in a different order than they were performed. Rust's std::sync::atomic types with explicit Ordering constraints give you fine-grained control over what other threads are guaranteed to see.
The Five Orderings
| Ordering | Guarantee | Use Case | Cost |
|---|---|---|---|
| Relaxed | Atomic operation only — no ordering guarantees for other memory | Counters, statistics where exact cross-thread order doesn't matter | Cheapest |
| Acquire | No reads/writes in current thread can be reordered before this load | Loading a flag/lock before reading shared data | Low |
| Release | No reads/writes in current thread can be reordered after this store | Storing data then releasing a flag/lock | Low |
| AcqRel | Both Acquire + Release (for read-modify-write ops like CAS) | compare_exchange, fetch_add when both loading and storing | Medium |
| SeqCst | Total global ordering — all threads agree on the order of all SeqCst operations | When you need a single agreed-upon timeline (rare) | Most expensive |
Default to SeqCst for correctness. Optimize to Acquire/Release pairs only when you understand the synchronization pattern. Use Relaxed only for independent counters.
Acquire-Release Pattern
The most common pattern: one thread prepares data and releases a flag, another thread acquires the flag and reads the data. The Acquire load is guaranteed to see all writes that happened before the matching Release store.
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
fn main() {
let data = Arc::new(std::sync::atomic::AtomicU64::new(0));
let ready = Arc::new(AtomicBool::new(false));
// Producer: write data, then release the flag
let d = data.clone();
let r = ready.clone();
thread::spawn(move || {
d.store(42, Ordering::Relaxed); // write data
r.store(true, Ordering::Release); // ← release barrier
});
// Consumer: acquire the flag, then read data
while !ready.load(Ordering::Acquire) {} // ← acquire barrier
let val = data.load(Ordering::Relaxed);
assert_eq!(val, 42); // guaranteed!
}
Compare-and-Exchange (CAS)
The fundamental building block for lock-free data structures. Atomically: "if current value equals expected, replace with new; otherwise return the actual current value."
use std::sync::atomic::{AtomicUsize, Ordering};
static COUNTER: AtomicUsize = AtomicUsize::new(0);
// Increment with CAS loop (lock-free)
fn increment() -> usize {
let mut current = COUNTER.load(Ordering::Relaxed);
loop {
match COUNTER.compare_exchange_weak(
current, // expected
current + 1, // new value
Ordering::AcqRel, // success ordering
Ordering::Relaxed, // failure ordering
) {
Ok(_) => return current,
Err(actual) => current = actual, // retry with actual value
}
}
}
// Simpler: use fetch_add for simple increments
let prev = COUNTER.fetch_add(1, Ordering::Relaxed);
compare_exchange_weak can spuriously fail on some architectures (ARM) but is faster in loops. Use the _weak variant in CAS loops and compare_exchange (strong) when you need exactly-once semantics outside a loop.
Available Atomic Types
| Type | Operations | Notes |
|---|---|---|
| AtomicBool | load, store, swap, compare_exchange | Flags, once-init guards |
| AtomicU8/U16/U32/U64/Usize | + fetch_add, fetch_sub, fetch_or, fetch_and, fetch_xor | Counters, bit flags, indices |
| AtomicI8/I16/I32/I64/Isize | Same as unsigned | Signed counters |
| AtomicPtr<T> | load, store, swap, compare_exchange | Lock-free data structures (requires unsafe to deref) |
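The "once-init guard" use from the table, as a minimal sketch: swap atomically stores the new value and returns the old one, so exactly one caller observes false even under contention.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Returns true for exactly one caller per flag:
/// swap atomically sets `true` and hands back the previous value.
fn claim_first(flag: &AtomicBool) -> bool {
    !flag.swap(true, Ordering::AcqRel)
}

fn main() {
    let started = AtomicBool::new(false);
    assert!(claim_first(&started));  // first caller wins
    assert!(!claim_first(&started)); // everyone else sees the flag set
    // For one-time initialization *with a value*, prefer std::sync::OnceLock.
}
```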
Common Lock-Free Patterns
Spinlock
use std::sync::atomic::{AtomicBool, Ordering};
use std::hint;
struct SpinLock { locked: AtomicBool }
impl SpinLock {
fn lock(&self) {
while self.locked.compare_exchange_weak(
false, true, Ordering::Acquire, Ordering::Relaxed
).is_err() {
while self.locked.load(Ordering::Relaxed) {
hint::spin_loop(); // hint CPU we're spinning
}
}
}
fn unlock(&self) {
self.locked.store(false, Ordering::Release);
}
}
Sequence Lock (SeqLock)
// Writer increments version before+after write (odd = writing)
// Reader reads version, reads data, re-reads version
// If versions match and are even → data is consistent
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};
struct SeqLock<T> {
version: AtomicUsize,
data: UnsafeCell<T>,
}
impl<T: Copy> SeqLock<T> {
fn read(&self) -> T {
loop {
let v1 = self.version.load(Ordering::Acquire);
if v1 & 1 != 0 { continue; } // writer active
let data = unsafe { *self.data.get() };
let v2 = self.version.load(Ordering::Acquire);
if v1 == v2 { return data; }
}
}
}
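The writer half described in the comments above, sketched to match the reader. This is a teaching sketch following the common single-writer seqlock recipe: multiple writers would need a separate lock or a CAS on the version, and a production version needs atomic access to the data itself (the racy unsafe read is formally a data race, which Miri will flag).

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

struct SeqLock<T> {
    version: AtomicUsize,
    data: UnsafeCell<T>,
}

// Sound only because readers validate the version and writers are
// assumed to be serialized.
unsafe impl<T: Copy + Send> Sync for SeqLock<T> {}

impl<T: Copy> SeqLock<T> {
    fn new(value: T) -> Self {
        Self { version: AtomicUsize::new(0), data: UnsafeCell::new(value) }
    }
    fn write(&self, value: T) {
        // Bump to odd: readers now retry
        let v = self.version.fetch_add(1, Ordering::Acquire);
        unsafe { *self.data.get() = value; }
        // Back to even: Release publishes the data write to readers
        self.version.store(v.wrapping_add(2), Ordering::Release);
    }
    fn read(&self) -> T {
        loop {
            let v1 = self.version.load(Ordering::Acquire);
            if v1 & 1 != 0 { continue; } // writer active
            let data = unsafe { *self.data.get() };
            let v2 = self.version.load(Ordering::Acquire);
            if v1 == v2 { return data; } // no write happened in between
        }
    }
}

fn main() {
    let lock = SeqLock::new(0u64);
    lock.write(42);
    assert_eq!(lock.read(), 42);
}
```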
46. Async Patterns & Pitfalls Advanced
Cancellation Safety
When you use tokio::select!, the losing branches are dropped (cancelled). If a future was mid-operation (e.g., halfway through reading from a channel), that data can be lost. A future is cancellation-safe if dropping it at any await point doesn't lose data.
| Operation | Cancel-Safe? | Why |
|---|---|---|
| tokio::sync::mpsc::Receiver::recv() | ✅ Yes | Message stays in channel if cancelled |
| tokio::io::AsyncReadExt::read() | ✅ Yes | No data consumed until Ready |
| tokio::io::AsyncReadExt::read_exact() | ❌ No | Partial reads are lost on cancellation |
| tokio::sync::Mutex::lock() | ✅ Yes | Lock attempt cancelled cleanly |
| futures::StreamExt::next() | ✅ Yes | Stream unchanged if cancelled |
| Custom futures with internal buffers | ❌ Usually No | Buffer contents lost on drop |
use tokio::sync::mpsc;
async fn safe_select(rx: &mut mpsc::Receiver<String>) {
loop {
tokio::select! {
// ✅ recv() is cancellation-safe
msg = rx.recv() => {
match msg {
Some(m) => println!("Got: {m}"),
None => break,
}
}
_ = tokio::time::sleep(std::time::Duration::from_secs(5)) => {
println!("Timeout — no message in 5s");
}
}
}
}
Streams (Async Iterators)
A Stream is the async equivalent of Iterator. It yields items asynchronously over time. Common in event processing, WebSocket connections, and database cursors.
use tokio_stream::{StreamExt, wrappers::IntervalStream};
use std::time::Duration;
async fn stream_example() {
// Create a stream that ticks every second
let mut stream = IntervalStream::new(tokio::time::interval(Duration::from_secs(1)))
.take(5) // only take 5 items
.map(|_| "tick"); // transform items
while let Some(item) = stream.next().await {
println!("{item}");
}
}
// Custom stream with async-stream crate
use async_stream::stream;
use tokio_stream::Stream; // the Stream trait itself
fn fibonacci_stream() -> impl Stream<Item = u64> {
stream! {
let (mut a, mut b) = (0, 1);
loop {
yield a;
(a, b) = (b, a + b);
}
}
}
Structured Concurrency Patterns
Fan-Out / Fan-In
use tokio::task::JoinSet;
async fn fan_out_fan_in(urls: Vec<String>) -> Vec<String> {
let mut set = JoinSet::new();
// Fan-out: spawn concurrent tasks
for url in urls {
set.spawn(async move {
reqwest::get(&url).await?.text().await
});
}
// Fan-in: collect results as they complete
let mut results = Vec::new();
while let Some(res) = set.join_next().await {
if let Ok(Ok(body)) = res {
results.push(body);
}
}
results
}
Timeout Wrapper
use tokio::time::{timeout, Duration};
async fn with_timeout<T>(
fut: impl std::future::Future<Output = T>,
secs: u64,
) -> Result<T, &'static str> {
timeout(Duration::from_secs(secs), fut)
.await
.map_err(|_| "operation timed out")
}
Common Async Pitfalls
std::sync::MutexGuard is !Send. Holding it across an .await means the task can't be moved between threads. Use tokio::sync::Mutex if you must hold the guard across await points, or restructure to drop the guard before awaiting.
// ❌ BAD: std::sync::MutexGuard held across .await — the future
// becomes !Send, so this fails to compile inside tokio::spawn
let guard = std_mutex.lock().unwrap();
some_async_fn().await; // guard still held across the await point
drop(guard);
// ✅ GOOD: drop guard before .await
{
let guard = std_mutex.lock().unwrap();
// use guard...
} // guard dropped here
some_async_fn().await; // safe!
// ✅ GOOD: use tokio::sync::Mutex if guard must span .await
let guard = tokio_mutex.lock().await;
some_async_fn().await;
drop(guard);
Never run blocking I/O or CPU-heavy work directly on the async runtime — it stalls an executor worker thread. Offload it with spawn_blocking.
// ❌ BAD: blocks the async runtime
let data = std::fs::read_to_string("big_file.txt")?;
// ✅ GOOD: offload to blocking thread pool
let data = tokio::task::spawn_blocking(|| {
std::fs::read_to_string("big_file.txt")
}).await??;
// ✅ GOOD: use async file I/O
let data = tokio::fs::read_to_string("big_file.txt").await?;
// ❌ SLOW: sequential — total time = A + B
let a = fetch_a().await;
let b = fetch_b().await;
// ✅ FAST: concurrent — total time = max(A, B)
let (a, b) = tokio::join!(fetch_a(), fetch_b());
// ✅ FAST: with error handling
let (a, b) = tokio::try_join!(fetch_a(), fetch_b())?;
Tokio Channel Types Quick Reference
| Channel | Pattern | Bounded? | Use Case |
|---|---|---|---|
| mpsc | Multi-producer, single-consumer | Both | Task → aggregator, work queues |
| oneshot | Single-producer, single-consumer | 1 message | Request/response, task result |
| broadcast | Multi-producer, multi-consumer | Bounded | Event bus, pub/sub notifications |
| watch | Single-producer, multi-consumer | 1 value (latest) | Config reload, shared state |
47. Generics & Monomorphization Core
How Generics Work in Rust
Rust generics use monomorphization: the compiler generates a specialized copy of the function/struct for each concrete type used. This means Vec<i32> and Vec<String> are completely separate types in the binary with no runtime overhead, unlike Java's type erasure or Go's interface-based generics.
Monomorphization gives zero-cost static dispatch at the price of binary size and compile time; prefer trait objects (dyn Trait) if runtime dispatch cost is acceptable.
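The trade-off in concrete form: both functions below produce the same result, but the generic one is monomorphized into a separate copy per concrete type, while the dyn version is a single function called through a vtable.

```rust
use std::fmt::Display;

// Static dispatch: a specialized copy per concrete T in the binary
fn describe_generic<T: Display>(x: T) -> String {
    format!("value: {x}")
}

// Dynamic dispatch: one copy; the Display call goes through a vtable
fn describe_dyn(x: &dyn Display) -> String {
    format!("value: {x}")
}

fn main() {
    assert_eq!(describe_generic(42), "value: 42");   // instantiated for i32
    assert_eq!(describe_generic("hi"), "value: hi"); // second copy, for &str
    assert_eq!(describe_dyn(&42), "value: 42");      // no new code generated
}
```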
Trait Bounds — Four Syntaxes
// 1. Inline bound
fn print_it<T: Display + Debug>(val: &T) { ... }
// 2. Where clause (preferred for complex bounds)
fn process<T, U>(t: T, u: U) -> String
where
T: Display + Clone + Send + 'static,
U: Into<String> + Debug,
{ ... }
// 3. impl Trait (argument position = anonymous generic)
fn log(item: &impl Display) { ... } // sugar for fn log<T: Display>(item: &T)
// 4. impl Trait (return position = opaque type)
fn make_iter() -> impl Iterator<Item = i32> {
(0..10).filter(|x| x % 2 == 0)
}
Turbofish Syntax
When the compiler can't infer the type parameter, use the turbofish ::<Type>:
let x = "42".parse::<i32>().unwrap(); // turbofish on method
let v = Vec::<u8>::new(); // turbofish on type
let n = std::mem::size_of::<u64>(); // turbofish on function
// Alternative: type annotation instead of turbofish
let x: i32 = "42".parse().unwrap();
let v: Vec<u8> = Vec::new();
Const Generics
Parameterize types and functions by compile-time constant values, not just types. Stabilized in Rust 1.51+.
// Array wrapper parameterized by size
struct Matrix<const ROWS: usize, const COLS: usize> {
data: [[f64; COLS]; ROWS],
}
impl<const R: usize, const C: usize> Matrix<R, C> {
fn transpose(&self) -> Matrix<C, R> {
let mut result = Matrix { data: [[0.0; R]; C] };
for i in 0..R {
for j in 0..C {
result.data[j][i] = self.data[i][j];
}
}
result
}
}
// Size-checked at compile time!
let m: Matrix<3, 4> = Matrix { data: [[0.0; 4]; 3] };
let t: Matrix<4, 3> = m.transpose(); // dimensions enforced!
// Generic function with const generic
fn first_n<T: Copy + Default, const N: usize>(slice: &[T]) -> [T; N] {
let mut arr = [T::default(); N];
arr.copy_from_slice(&slice[..N]);
arr
}
PhantomData — Zero-Size Type Marker
PhantomData<T> tells the compiler your type logically owns or relates to a type T without actually storing it. Common uses: lifetimes, variance, and marker traits.
use std::marker::PhantomData;
// Type-state pattern: prevent misuse at compile time
struct Locked;
struct Unlocked;
struct Door<State> {
name: String,
_state: PhantomData<State>,
}
impl Door<Locked> {
fn unlock(self) -> Door<Unlocked> {
Door { name: self.name, _state: PhantomData }
}
}
impl Door<Unlocked> {
fn open(&self) { println!("Opening {}", self.name); }
fn lock(self) -> Door<Locked> {
Door { name: self.name, _state: PhantomData }
}
}
let door = Door::<Locked> { name: "Front".into(), _state: PhantomData };
// door.open(); // ❌ compile error — Door<Locked> has no open()
let door = door.unlock();
door.open(); // ✅ works — Door<Unlocked> has open()
Generic Associated Types (GATs)
GATs allow associated types in traits to have their own generic parameters. Stabilized in Rust 1.65.
// Lending iterator — yields references tied to each call
trait LendingIterator {
type Item<'a> where Self: 'a;
fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}
// Collection trait with GAT for flexible return types
trait Container {
type Iter<'a>: Iterator where Self: 'a;
fn iter<'a>(&'a self) -> Self::Iter<'a>;
}
48. Global State & Initialization Intermediate
The Problem with Global State in Rust
Rust's ownership system makes global mutable state intentionally difficult. static mut requires unsafe because the compiler can't verify exclusive access. The ecosystem provides safe alternatives for every use case.
const vs static
| Feature | const | static |
|---|---|---|
| Evaluated at | Compile time (inlined) | Compile time (single memory location) |
| Address | No fixed address (may be inlined) | Fixed address (&'static T) |
| Mutable? | Never | static mut (unsafe) |
| Interior mutability? | No | Yes (with Mutex, Atomic, etc.) |
| Drop? | No | No (statics are never dropped) |
| Use for | Mathematical constants, inline values | Shared state, lookup tables, strings |
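The table's key rows in code: the const is inlined wherever it's used, while the static is one shared location that interior mutability (an atomic here) lets us update safely.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const MAX_RETRIES: u32 = 3;                          // inlined at each use site
static REQUEST_COUNT: AtomicU64 = AtomicU64::new(0); // single shared location

fn handle_request() -> u64 {
    // Interior mutability: a plain `static` can be mutated through an atomic
    REQUEST_COUNT.fetch_add(1, Ordering::Relaxed) + 1
}

fn main() {
    for _ in 0..MAX_RETRIES {
        handle_request();
    }
    assert_eq!(REQUEST_COUNT.load(Ordering::Relaxed), 3);
    // &REQUEST_COUNT is a stable &'static address; MAX_RETRIES may have none
}
```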
OnceLock — Initialize Once, Read Many
Part of std since Rust 1.70. Thread-safe, one-time initialization. The gold standard for global config/state.
use std::sync::OnceLock;
static CONFIG: OnceLock<AppConfig> = OnceLock::new();
fn init_config(path: &str) {
let config = load_config(path);
CONFIG.set(config).expect("config already initialized");
}
fn get_config() -> &'static AppConfig {
CONFIG.get().expect("config not initialized")
}
// get_or_init: initialize on first access
fn config() -> &'static AppConfig {
CONFIG.get_or_init(|| load_config("config.toml"))
}
LazyLock — Lazy Initialization
Stabilized in Rust 1.80. Combines OnceLock with an initialization function — the value is computed on first access.
use std::sync::LazyLock;
use regex::Regex;
// Compiled regex — initialized on first use
static EMAIL_RE: LazyLock<Regex> = LazyLock::new(|| {
Regex::new(r"^[\w.+-]+@[\w-]+\.[\w.]+$").unwrap()
});
// Database connection pool — created on first access
static DB_POOL: LazyLock<Pool<Postgres>> = LazyLock::new(|| {
let url = std::env::var("DATABASE_URL").unwrap();
Pool::connect_lazy(&url).unwrap()
});
thread_local! — Per-Thread State
Each thread gets its own independent copy. No synchronization needed. Great for caches, RNGs, or thread-specific context.
use std::cell::RefCell;
use rand::{rngs::SmallRng, SeedableRng}; // rand crate with "small_rng" feature
thread_local! {
static BUFFER: RefCell<Vec<u8>> = RefCell::new(Vec::with_capacity(1024));
static RNG: RefCell<SmallRng> = RefCell::new(SmallRng::from_entropy());
}
fn use_buffer() {
BUFFER.with_borrow_mut(|buf| { // with_borrow_mut stable since Rust 1.73
buf.clear();
buf.extend_from_slice(&[1, 2, 3]);
});
}
When to Use What
| Need | Solution | Example |
|---|---|---|
| Compile-time constant | const | const PI: f64 = 3.14159; |
| Fixed immutable data | static | static NAMES: &[&str] = &["a", "b"]; |
| Initialize once at startup | OnceLock | Config, DB pool |
| Initialize lazily on first use | LazyLock | Compiled regex, computed lookup |
| Thread-local cache/RNG | thread_local! | Per-thread buffers, counters |
| Read-heavy shared state | RwLock<T> | Caches with rare updates |
| Write-heavy shared state | Mutex<T> | Shared queues, counters |
| Atomic counters/flags | AtomicU64 etc. | Statistics, feature flags |
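For the read-heavy row, std's RwLock works directly in a static (its constructor has been const since Rust 1.63), so no lazy wrapper is needed — a minimal sketch:

```rust
use std::sync::RwLock;

// Many readers may hold the lock at once; writers get exclusive access
static NAMES: RwLock<Vec<String>> = RwLock::new(Vec::new());

fn add(name: &str) {
    NAMES.write().unwrap().push(name.to_string()); // exclusive
}

fn contains(name: &str) -> bool {
    NAMES.read().unwrap().iter().any(|n| n == name) // shared, concurrent
}

fn main() {
    add("alice");
    assert!(contains("alice"));
    assert!(!contains("bob"));
}
```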
49. Debugging & Profiling Intermediate
Print Debugging
// dbg! macro — prints file:line and expression value, returns it
let x = dbg!(5 * 2); // prints: [src/main.rs:3] 5 * 2 = 10
// Chain dbg! in expressions
let result = dbg!(dbg!(x) + dbg!(3));
// Debug vs Display formatting
println!("{:?}", value); // Debug (derive-able)
println!("{:#?}", value); // Pretty-printed Debug
println!("{}", value); // Display (manual impl)
// eprintln! for stderr (won't mix with stdout)
eprintln!("DEBUG: state = {:?}", state);
GDB / LLDB
Rust produces standard DWARF debug info. Use rust-gdb or rust-lldb for Rust-aware pretty-printing.
# Build with debug info (default for cargo build without --release)
$ cargo build
$ rust-gdb target/debug/my_app
# Common GDB commands
(gdb) break main # set breakpoint at main
(gdb) break src/lib.rs:42 # set breakpoint at line 42
(gdb) run # start execution
(gdb) next # step over
(gdb) step # step into
(gdb) print my_var # print variable
(gdb) print *my_vec # print Vec contents
(gdb) backtrace # show call stack
(gdb) info locals # all local variables
(gdb) continue # resume execution
Miri — Undefined Behavior Detector
Miri interprets Rust MIR and detects undefined behavior in unsafe code: out-of-bounds access, use-after-free, data races, invalid references, uninitialized memory.
# Install and run
$ rustup +nightly component add miri
$ cargo +nightly miri run
$ cargo +nightly miri test
# Configure detection flags
$ MIRIFLAGS="-Zmiri-disable-stacked-borrows" cargo +nightly miri test
$ MIRIFLAGS="-Zmiri-tree-borrows" cargo +nightly miri test # newer model
Profiling with Flamegraphs
# Install flamegraph tool
$ cargo install flamegraph
# Profile your release build
$ cargo flamegraph --release -- --bench my_bench
# Outputs: flamegraph.svg (open in browser)
# For perf on Linux
$ cargo build --release
$ perf record -g target/release/my_app
$ perf report
Sanitizers (Nightly)
# AddressSanitizer — buffer overflows, use-after-free
$ RUSTFLAGS="-Zsanitizer=address" cargo +nightly run
# ThreadSanitizer — data races
$ RUSTFLAGS="-Zsanitizer=thread" cargo +nightly run
# MemorySanitizer — reads of uninitialized memory
$ RUSTFLAGS="-Zsanitizer=memory" cargo +nightly run
Cargo Debug Utilities
| Tool | Command | Purpose |
|---|---|---|
| cargo-expand | cargo expand | Show macro-expanded code |
| cargo-asm | cargo asm my_crate::func | View assembly output of a function |
| cargo-llvm-lines | cargo llvm-lines | Show which generics cause most code bloat |
| cargo-bloat | cargo bloat --release | Show what's taking space in the binary |
| cargo-udeps | cargo udeps | Find unused dependencies |
| cargo-audit | cargo audit | Check for known security vulnerabilities |
| cargo-deny | cargo deny check | License and security policy enforcement |
Compile-Time Debugging
// Print type at compile time (causes error showing the type)
let x = complex_expression();
let _: () = x; // error: expected `()`, found `HashMap<String, Vec<i32>>`
// std::any::type_name for runtime type inspection
fn print_type<T>(_: &T) {
println!("Type: {}", std::any::type_name::<T>());
}
// compile_error! for intentional build failures in macros/cfg
#[cfg(not(any(feature = "postgres", feature = "sqlite")))]
compile_error!("Must enable either 'postgres' or 'sqlite' feature");
50. Cross-Compilation & Targets Intermediate
Target Triples
A target triple describes the platform: <arch>-<vendor>-<os>-<abi>. Rust supports 80+ targets.
| Target | Description |
|---|---|
| x86_64-unknown-linux-gnu | Linux x86_64 (glibc) |
| x86_64-unknown-linux-musl | Linux x86_64 (static binary, musl libc) |
| aarch64-unknown-linux-gnu | Linux ARM64 (Raspberry Pi 4, AWS Graviton) |
| x86_64-apple-darwin | macOS x86_64 (Intel) |
| aarch64-apple-darwin | macOS ARM64 (Apple Silicon) |
| x86_64-pc-windows-msvc | Windows x86_64 |
| wasm32-unknown-unknown | WebAssembly (no WASI) |
| thumbv7em-none-eabihf | ARM Cortex-M (embedded, no OS) |
# List all available targets
$ rustup target list
# Add a target
$ rustup target add aarch64-unknown-linux-gnu
# Build for specific target
$ cargo build --target aarch64-unknown-linux-gnu --release
# Check default target
$ rustc --print cfg | grep target
Using cross (Docker-based cross-compilation)
The cross tool provides pre-built Docker images with the correct toolchains. No manual linker setup needed.
# Install cross
$ cargo install cross
# Build for ARM Linux (uses Docker automatically)
$ cross build --target aarch64-unknown-linux-gnu --release
# Run tests on target platform (via QEMU in Docker)
$ cross test --target aarch64-unknown-linux-gnu
Static Binaries with musl
Build fully static binaries (no dynamic library dependencies) for easy deployment:
# Add musl target
$ rustup target add x86_64-unknown-linux-musl
# Build static binary
$ cargo build --target x86_64-unknown-linux-musl --release
# Verify it's static
$ file target/x86_64-unknown-linux-musl/release/myapp
# myapp: ELF 64-bit LSB executable, x86-64, statically linked
$ ldd target/x86_64-unknown-linux-musl/release/myapp
# not a dynamic executable
Platform-Specific Code
// Conditional compilation by target OS
#[cfg(target_os = "linux")]
fn get_memory() -> u64 { /* read /proc/meminfo */ }
#[cfg(target_os = "macos")]
fn get_memory() -> u64 { /* use sysctl */ }
#[cfg(target_os = "windows")]
fn get_memory() -> u64 { /* use GlobalMemoryStatusEx */ }
// Conditional dependencies in Cargo.toml
// [target.'cfg(target_os = "linux")'.dependencies]
// procfs = "0.16"
// Architecture-specific
#[cfg(target_arch = "x86_64")]
fn fast_path() { /* use SIMD */ }
#[cfg(not(target_arch = "x86_64"))]
fn fast_path() { /* scalar fallback */ }
// Combine conditions
#[cfg(all(target_os = "linux", target_arch = "aarch64"))]
fn arm_linux_specific() { }
CI/CD Cross-Compilation Matrix
# GitHub Actions example
# .github/workflows/release.yml
strategy:
matrix:
include:
- target: x86_64-unknown-linux-gnu
os: ubuntu-latest
- target: x86_64-unknown-linux-musl
os: ubuntu-latest
- target: aarch64-unknown-linux-gnu
os: ubuntu-latest
- target: x86_64-apple-darwin
os: macos-latest
- target: aarch64-apple-darwin
os: macos-latest
- target: x86_64-pc-windows-msvc
os: windows-latest
steps:
- uses: actions/checkout@v4
- run: rustup target add ${{ matrix.target }}
- run: cargo build --release --target ${{ matrix.target }}
51. Send & Sync Deep Dive Advanced
What They Mean
Send and Sync are auto traits (also called marker traits) that the compiler derives automatically for your types. They have no methods — they exist purely to express thread-safety invariants at compile time. If your type contains only Send/Sync fields, it is automatically Send/Sync.
The Formal Definitions
// From std::marker (simplified)
pub unsafe auto trait Send { }
pub unsafe auto trait Sync { }
// "auto" means: compiler implements it for types whose fields are all Send/Sync
// "unsafe" means: manual impl requires YOU to guarantee the invariant
// The critical relationship:
// T: Sync implies &T: Send
// This is why Sync exists — it's about what &T can do
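The "T: Sync implies &T: Send" rule can be checked with a tiny compile-time probe (assert_send is our own hypothetical helper, not a std function):

```rust
// Probe: a generic function that only accepts Send types
fn assert_send<T: Send>() {}

fn main() {
    // i32 is Sync, so &i32 is Send — references may cross threads
    assert_send::<&i32>();
    assert_send::<&String>(); // String is Sync too

    // Cell<i32> is !Sync, so &Cell<i32> is !Send:
    // assert_send::<&std::cell::Cell<i32>>(); // ✗ does not compile
    println!("Sync references are Send");
}
```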
Why Rc<T> is !Send and !Sync
Rc<T> uses a non-atomic reference count. If two threads increment/decrement it simultaneously, the count gets corrupted (data race → use-after-free or double-free). The fix: Arc<T> uses atomic operations for the count.
use std::rc::Rc;
use std::sync::Arc;
let rc = Rc::new(42);
// std::thread::spawn(move || println!("{}", rc)); // ❌ Rc<i32> is !Send
let arc = Arc::new(42);
std::thread::spawn(move || println!("{}", arc)); // ✅ Arc<i32> is Send + Sync
Why Cell<T> and RefCell<T> are !Sync
Both provide interior mutability through shared references (&self). If &Cell<T> were Send (which requires Cell<T>: Sync), two threads could call .set() simultaneously — a data race. They use non-atomic operations internally, so concurrent access is undefined behavior.
use std::cell::{Cell, RefCell};
// Cell<T>: Send (if T: Send), but NOT Sync
// You can move a Cell to another thread, but you can't share &Cell across threads
let cell = Cell::new(42);
std::thread::spawn(move || cell.set(99)); // ✅ moved (Send)
// But sharing is forbidden:
// let cell = Cell::new(42);
// let r = &cell;
// std::thread::spawn(move || r.set(99)); // ❌ &Cell is !Send (Cell is !Sync)
Thread-Safe Equivalents
| Single-Threaded (!Sync) | Thread-Safe Equivalent | Mechanism |
|---|---|---|
| Rc<T> | Arc<T> | Atomic reference counting |
| Cell<T> | AtomicT / Mutex<T> | Atomic ops or locking |
| RefCell<T> | RwLock<T> / Mutex<T> | OS locking primitives |
| Rc<RefCell<T>> | Arc<Mutex<T>> | Shared ownership + locking |
| *mut T | Wrapper with unsafe impl Send/Sync | You guarantee safety |
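As a concrete migration of the last table row but one, the classic single-threaded Rc<RefCell<i32>> counter becomes Arc<Mutex<i32>> (a minimal sketch):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Thread-safe replacement for Rc<RefCell<i32>>
    let counter = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = Arc::clone(&counter); // like Rc::clone, but atomic
            thread::spawn(move || {
                *c.lock().unwrap() += 1; // lock() replaces borrow_mut()
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(*counter.lock().unwrap(), 4);
    println!("count = {}", counter.lock().unwrap());
}
```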
Comprehensive Type Reference
| Type | Send | Sync | Reasoning |
|---|---|---|---|
| i32, f64, bool, char | ✅ | ✅ | No pointers, trivially copyable |
| String, Vec<T> | ✅ (if T: Send) | ✅ (if T: Sync) | Owned heap data, no shared state |
| &T | ✅ (if T: Sync) | ✅ (if T: Sync) | Immutable shared reference |
| &mut T | ✅ (if T: Send) | ✅ (if T: Sync) | Exclusive reference; Sync follows the referent |
| Box<T> | ✅ (if T: Send) | ✅ (if T: Sync) | Unique ownership of heap data |
| Rc<T> | ❌ | ❌ | Non-atomic refcount |
| Arc<T> | ✅ (if T: Send+Sync) | ✅ (if T: Send+Sync) | Atomic refcount |
| Cell<T> | ✅ (if T: Send) | ❌ | Interior mutability not thread-safe |
| RefCell<T> | ✅ (if T: Send) | ❌ | Non-atomic borrow tracking |
| Mutex<T> | ✅ (if T: Send) | ✅ (if T: Send) | Lock guarantees exclusive access |
| RwLock<T> | ✅ (if T: Send) | ✅ (if T: Send+Sync) | Multiple readers or one writer |
| MutexGuard<T> | ❌ | ✅ (if T: Sync) | Must unlock on same thread (POSIX) |
| *const T, *mut T | ❌ | ❌ | Compiler can't verify safety |
| AtomicU64 | ✅ | ✅ | Hardware-level atomic operations |
| mpsc::Sender<T> | ✅ (if T: Send) | ✅ (if T: Send, since Rust 1.72) | Clonable handle; was !Sync before 1.72 |
| mpsc::Receiver<T> | ✅ (if T: Send) | ❌ | Single consumer |
Auto Trait Rules
// A struct is Send if ALL its fields are Send
struct MyStruct {
name: String, // Send ✅
count: Arc<Mutex<i32>>, // Send ✅
}
// MyStruct: Send ✅ (all fields are Send)
struct NotSend {
name: String, // Send ✅
local: Rc<i32>, // Send ❌
}
// NotSend: Send ❌ (one field is !Send → whole struct is !Send)
Negative Implementations (Opt-Out)
// The standard library opts OUT of Send/Sync for certain types:
impl<T: ?Sized> !Send for Rc<T> {}
impl<T: ?Sized> !Sync for Rc<T> {}
impl<T: ?Sized> !Sync for Cell<T> {}
impl<T: ?Sized> !Sync for RefCell<T> {}
// You can opt out for your own types too (nightly-only syntax):
// impl !Send for MyType {}
// Stable workaround: include a !Send field
use std::marker::PhantomData;
struct NotSendOrSync {
data: i32,
_marker: PhantomData<*const ()>, // *const () is !Send + !Sync
}
Unsafe Manual Implementations
When wrapping raw pointers or FFI types, you may need to manually assert thread safety. This is unsafe because you're telling the compiler "trust me."
// Wrapper around a raw pointer that we guarantee is thread-safe
struct FfiHandle {
ptr: *mut std::ffi::c_void,
}
// SAFETY: The underlying C library is thread-safe.
// The handle can be sent to another thread (transferred).
unsafe impl Send for FfiHandle {}
// SAFETY: The underlying C library uses internal locking.
// Multiple threads can call functions with &FfiHandle safely.
unsafe impl Sync for FfiHandle {}
// Common in real-world crates:
// - Database connection pools (diesel, sqlx)
// - Window handles (winit, raw-window-handle)
// - GPU resources (wgpu, vulkano)
Manually implementing Send/Sync with unsafe impl is one of the most dangerous things you can do in Rust. A wrong impl can cause data races, use-after-free, and undefined behavior that the borrow checker cannot catch. Always document why the implementation is safe with a // SAFETY: comment.
How Send/Sync Interact with Async
In async Rust, futures may be polled on different threads. This means any data held across an .await point inside a tokio::spawn task must be Send.
// tokio::spawn requires: Future: Send + 'static
let rc = Rc::new(42);
// tokio::spawn(async move {
// println!("{}", rc); // ❌ Rc is !Send
// });
let arc = Arc::new(42);
tokio::spawn(async move {
println!("{}", arc); // ✅ Arc is Send
});
// MutexGuard across .await — common mistake:
// let guard = std_mutex.lock().unwrap();
// some_async_fn().await; // ❌ MutexGuard is !Send
// Use tokio::sync::Mutex instead for async-safe locking
Using Send & Sync Explicitly — Complete Guide
1. Requiring Send/Sync in Function Bounds
The most common explicit use: constraining generic parameters so your function only accepts thread-safe types.
use std::thread;
// Require T: Send to move it to another thread
fn spawn_with<T: Send + 'static>(value: T) -> thread::JoinHandle<T> {
thread::spawn(move || {
println!("Processing on another thread");
value
})
}
// Require T: Send + Sync to share &T across threads
fn share_across_threads<T: Send + Sync + std::fmt::Debug + 'static>(value: T) {
let shared = std::sync::Arc::new(value);
for _ in 0..4 {
let clone = shared.clone();
thread::spawn(move || {
println!("Reading: {:?}", &*clone);
});
}
}
// Why 'static? thread::spawn requires it because the thread
// might outlive the caller — so T can't borrow short-lived data.
// Scoped threads (std::thread::scope) remove this restriction.
2. Using Send/Sync as Trait Object Bounds
When using dyn Trait, you must explicitly add Send/Sync bounds if the trait object needs to cross threads.
use std::sync::Arc;
// A trait object that can be shared across threads
type SharedHandler = Arc<dyn Handler + Send + Sync>;
// A closure that can be sent to another thread
type Task = Box<dyn FnOnce() + Send + 'static>;
// Thread pool using Send closures
struct ThreadPool {
sender: std::sync::mpsc::Sender<Task>,
}
impl ThreadPool {
fn execute<F>(&self, job: F)
where
F: FnOnce() + Send + 'static, // ← explicit Send bound
{
self.sender.send(Box::new(job)).unwrap();
}
}
// Trait with Send + Sync supertraits
trait Service: Send + Sync {
fn handle(&self, req: Request) -> Response;
}
// Now every implementor of Service is automatically Send + Sync
// and Arc<dyn Service> works without extra bounds
3. Async-Specific Send Bounds
tokio::spawn requires Future: Send + 'static. This means everything held across .await must be Send. Here's how to handle it:
// ✅ Using Send bound on async trait methods
trait AsyncService: Send + Sync {
// The returned future must be Send for tokio::spawn
fn process(&self) -> impl std::future::Future<Output = String> + Send;
}
// ✅ Spawnable async function with Send constraint
async fn spawn_task<F, T>(future: F) -> T
where
F: std::future::Future<Output = T> + Send + 'static,
T: Send + 'static,
{
tokio::spawn(future).await.unwrap()
}
// ❌ PROBLEM: Rc held across .await makes future !Send
async fn bad() {
let data = std::rc::Rc::new(42); // Rc is !Send
some_async_op().await; // data lives across .await
println!("{data}"); // → entire future is !Send
}
// tokio::spawn(bad()); // ❌ compile error
// ✅ FIX 1: Use Arc instead of Rc
async fn good_arc() {
let data = std::sync::Arc::new(42);
some_async_op().await;
println!("{data}");
}
// ✅ FIX 2: Drop the !Send value before .await
async fn good_drop() {
{
let data = std::rc::Rc::new(42);
println!("{data}");
} // Rc dropped here
some_async_op().await; // no !Send data held → future is Send
}
// ✅ FIX 3: Use spawn_local for !Send futures (single-threaded)
tokio::task::spawn_local(async {
let data = std::rc::Rc::new(42);
some_async_op().await;
println!("{data}"); // ✅ spawn_local doesn't require Send
});
4. Implementing Send/Sync for Custom Types (unsafe)
When your type wraps raw pointers or FFI handles, the compiler can't verify thread safety — you must assert it manually.
// ─── Example 1: FFI Handle Wrapper ───
struct DatabaseConn {
handle: *mut std::ffi::c_void, // raw pointer → auto !Send !Sync
}
// SAFETY: The underlying C database library:
// - Handles can be transferred between threads (Send)
// - Uses internal locking for concurrent reads (Sync)
// This was verified by reading the library docs and source code.
unsafe impl Send for DatabaseConn {}
unsafe impl Sync for DatabaseConn {}
impl DatabaseConn {
fn query(&self, sql: &str) -> Vec<Row> {
unsafe { ffi_query(self.handle, sql.as_ptr()) }
}
}
// ─── Example 2: Send but NOT Sync ───
struct GpuBuffer {
ptr: *mut u8,
len: usize,
}
// SAFETY: GPU buffers can be transferred between threads,
// but must not be accessed concurrently (no internal locking).
unsafe impl Send for GpuBuffer {}
// NOT implementing Sync — concurrent &GpuBuffer access is UB
// Use with Mutex to make it safe for shared access:
let shared_buf = std::sync::Arc::new(std::sync::Mutex::new(GpuBuffer { /*...*/ }));
- ALWAYS write a // SAFETY: comment explaining why it's safe
- For Send: the type must be safe to move to another thread
- For Sync: &Self must be safe to use from multiple threads simultaneously
- If in doubt, implement only Send and wrap in Mutex for shared access
- Test with cargo +nightly miri test to detect data races
5. Opting OUT of Send/Sync
use std::marker::PhantomData;
// Method 1: PhantomData with a !Send type (stable)
struct NotSend {
data: i32,
_marker: PhantomData<*const ()>, // *const () is !Send + !Sync
}
// Method 2: PhantomData with Rc (also !Send + !Sync)
struct ThreadLocal {
data: i32,
_no_send: PhantomData<std::rc::Rc<()>>,
}
// Method 3: Negative impl (nightly only)
// #![feature(negative_impls)]
// impl !Send for MyType {}
// impl !Sync for MyType {}
// Why opt out? When your type has safety invariants
// tied to a specific thread (e.g., GUI handles, thread-local caches)
6. Compile-Time Assertions
Verify your types implement Send/Sync at compile time:
// Static assertion — fails at compile time if not satisfied
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}
fn assert_send_sync<T: Send + Sync>() {}
// Use in tests or const blocks
#[test]
fn my_types_are_thread_safe() {
assert_send::<MyService>();
assert_sync::<MyService>();
assert_send::<DatabaseConn>();
assert_send_sync::<Arc<MyService>>();
// These would fail to compile (which is correct!):
// assert_send::<Rc<i32>>(); // Rc is !Send
// assert_sync::<Cell<i32>>(); // Cell is !Sync
}
// Const assertion (no test runner needed)
const _: () = {
fn check<T: Send + Sync>() {}
// Naming the fn item forces the bound check at compile time
let _ = check::<MyService>; // compile error if MyService isn't Send+Sync
};
7. Real-World Pattern: Type-Erased Task Queue
use std::sync::{Arc, Mutex};
// A type-erased, thread-safe task queue
struct TaskQueue {
tasks: Mutex<Vec<Box<dyn FnOnce() + Send>>>,
// ^^^^^^^^^^^^^^^^ ^^^^
// type-erased closure must be Send
// to move between threads
}
impl TaskQueue {
fn push(&self, task: impl FnOnce() + Send + 'static) {
self.tasks.lock().unwrap().push(Box::new(task));
}
fn run_next(&self) -> bool {
if let Some(task) = self.tasks.lock().unwrap().pop() {
task(); // execute the closure
true
} else { false }
}
}
// TaskQueue itself is Send + Sync because:
// - Mutex<T> is Sync if T is Send (Vec<Box<dyn FnOnce()+Send>> is Send) ✅
// - Mutex<T> is Send if T is Send ✅
// So Arc<TaskQueue> can be shared across threads:
let queue = Arc::new(TaskQueue { tasks: Mutex::new(Vec::new()) });
let q = queue.clone();
std::thread::spawn(move || {
q.push(|| println!("task from thread!"));
});
Summary Cheat Sheet
| What You Want | Bound to Use | Example |
|---|---|---|
| Move value to a thread | T: Send + 'static | thread::spawn, tokio::spawn |
| Share &T across threads | T: Sync (makes &T: Send) | Arc<T> shared references |
| Thread-safe trait object | dyn Trait + Send + Sync | Service handlers, callbacks |
| Async spawn | Future + Send + 'static | tokio::spawn(fut) |
| Thread-safe closure | FnOnce() + Send + 'static | Thread pools, task queues |
| Assert at compile time | fn check<T: Send + Sync>() {} | Test/const blocks |
| Opt out (make !Send) | PhantomData<*const ()> | Thread-local handles |
| Force Send on FFI | unsafe impl Send | Raw pointer wrappers |
52. Coroutines & Generators Advanced / Nightly
What Are Coroutines?
A coroutine is a function that can suspend its execution (yield) and be resumed later from where it left off, preserving its local state. Unlike regular functions (which run to completion), coroutines have multiple entry/exit points.
In Rust, coroutines are the foundation underlying both async/await and generators. The compiler transforms async fn into coroutines (state machines), and gen blocks are syntactic sugar for coroutines that yield iterator items.
The Coroutine Hierarchy in Rust
Low-Level Coroutines (Nightly)
The Coroutine trait is the raw building block. It's what the compiler generates behind the scenes for async and gen. You rarely use it directly, but understanding it reveals how everything works.
#![feature(coroutines, coroutine_trait)]
use std::ops::{Coroutine, CoroutineState};
use std::pin::Pin;
// The Coroutine trait (simplified from std)
// trait Coroutine<R = ()> {
// type Yield; // type of values yielded at suspension points
// type Return; // type of the final return value
// fn resume(self: Pin<&mut Self>, arg: R) -> CoroutineState<Yield, Return>;
// }
//
// enum CoroutineState<Y, R> {
// Yielded(Y), // coroutine suspended, here's the yielded value
// Complete(R), // coroutine finished, here's the return value
// }
fn main() {
// Create a coroutine using closure-like syntax with yield
let mut coro = #[coroutine] || {
println!("Step 1");
yield 1; // suspend, yield 1
println!("Step 2");
yield 2; // suspend, yield 2
println!("Step 3");
"done" // return value
};
// Drive the coroutine by calling resume()
let mut pinned = Pin::new(&mut coro);
match pinned.as_mut().resume(()) {
CoroutineState::Yielded(val) =>
println!("Yielded: {val}"), // "Step 1" then "Yielded: 1"
CoroutineState::Complete(_) => unreachable!(),
}
match pinned.as_mut().resume(()) {
CoroutineState::Yielded(val) =>
println!("Yielded: {val}"), // "Step 2" then "Yielded: 2"
CoroutineState::Complete(_) => unreachable!(),
}
match pinned.as_mut().resume(()) {
CoroutineState::Yielded(_) => unreachable!(),
CoroutineState::Complete(ret) =>
println!("Complete: {ret}"), // "Step 3" then "Complete: done"
}
}
Coroutines with Resume Arguments
Coroutines can receive values on each resume, enabling two-way communication:
#![feature(coroutines, coroutine_trait)]
use std::ops::{Coroutine, CoroutineState};
use std::pin::Pin;
fn main() {
// Coroutine that receives String on each resume
let mut coro = #[coroutine] |input: String| {
println!("First input: {input}");
let next: String = yield input.len(); // yield length, receive next input
println!("Second input: {next}");
next.len() + input.len() // return final value
};
let mut pinned = Pin::new(&mut coro);
// Resume with "hello" → yields 5
let CoroutineState::Yielded(len) = pinned.as_mut().resume("hello".into())
else { panic!() };
println!("Got length: {len}"); // 5
// Resume with "world" → completes with 10
let CoroutineState::Complete(total) = pinned.as_mut().resume("world".into())
else { panic!() };
println!("Total: {total}"); // 10
}
Gen Blocks — Coroutines as Iterators (Nightly)
gen blocks are the ergonomic way to create iterators using coroutine syntax. Each yield produces the next iterator item. RFC 3513, available on nightly.
#![feature(gen_blocks)]
// Simple generator — replaces manual Iterator impl
fn fibonacci() -> impl Iterator<Item = u64> {
gen {
let (mut a, mut b) = (0u64, 1u64);
loop {
yield a;
(a, b) = (b, a + b);
}
}
}
// With early return — iterator stops when gen block returns
fn primes_up_to(limit: u64) -> impl Iterator<Item = u64> {
gen move {
for n in 2..=limit {
if (2..n).all(|d| n % d != 0) {
yield n;
}
}
// implicit return here → iterator yields None
}
}
// Iterate like any other iterator
for p in primes_up_to(50) {
print!("{p} "); // 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47
}
// Use with iterator adapters
let sum: u64 = fibonacci().take(20).sum();
Async Gen Blocks — Coroutines as Streams (Nightly)
Combine async + gen to produce async streams — items that are yielded asynchronously:
#![feature(gen_blocks, async_iterator)]
use std::async_iter::AsyncIterator;
// Async generator — yields items asynchronously
fn fetch_pages(urls: Vec<String>) -> impl AsyncIterator<Item = String> {
async gen {
for url in urls {
let body = reqwest::get(&url).await
.unwrap().text().await.unwrap();
yield body; // yield each page as it's fetched
}
}
}
// Consume with for await (nightly syntax)
// for await page in fetch_pages(urls) {
// process(page);
// }
How the Compiler Transforms Coroutines
The compiler turns coroutines into the same state machine enum pattern used for async. Each yield/.await becomes a state variant:
// Your gen block:
gen {
let x = 10;
yield x; // state boundary
let y = x + 20;
yield y; // state boundary
yield x + y; // state boundary
}
// Compiler generates roughly:
enum GenState {
Start,
AfterYield1 { x: i32 }, // x lives across yield
AfterYield2 { x: i32, y: i32 }, // x and y live across yield
Done,
}
// The Iterator::next() impl matches on current state,
// runs code until next yield, transitions state, returns Some(value).
// When the gen block exits, returns None.
Gen Blocks vs Manual Iterator
| Aspect | Manual impl Iterator | gen Block |
|---|---|---|
| Boilerplate | Define struct + impl Iterator + next() | Just gen { yield ... } |
| State management | You track state fields manually | Compiler tracks local variables |
| Readability | State machine logic split across methods | Sequential, imperative code |
| Performance | Identical (same state machine generated) | Identical (zero overhead) |
| Flexibility | Can implement DoubleEndedIterator, ExactSizeIterator | Only Iterator (for now) |
| Stability | Stable Rust | Nightly only (as of March 2026) |
// Manual Iterator — 20+ lines
struct Range { current: i32, end: i32 }
impl Iterator for Range {
type Item = i32;
fn next(&mut self) -> Option<i32> {
if self.current < self.end {
let val = self.current;
self.current += 1;
Some(val)
} else { None }
}
}
// Gen block — 5 lines, same result
fn range(start: i32, end: i32) -> impl Iterator<Item = i32> {
gen move { for i in start..end { yield i; } }
}
Coroutines vs Async vs Gen — Relationship
| Feature | Trait Produced | Suspension | Use Case | Status |
|---|---|---|---|---|
| async { } | impl Future | .await | Non-blocking I/O, concurrency | Stable |
| gen { } | impl Iterator | yield | Lazy sequences, data pipelines | Nightly |
| async gen { } | impl AsyncIterator | yield + .await | Async streams (paginated APIs, WebSockets) | Nightly |
| #[coroutine] \|\| { } | impl Coroutine | yield + resume args | Low-level state machines, interpreters | Nightly |
async and gen are just ergonomic layers on top of the raw Coroutine trait, restricting how yield/resume work to match their domain (futures vs iterators).
53. The Poll Model — How Futures Are Driven Advanced
The Future Trait
Every async operation in Rust ultimately implements the Future trait. It has exactly one method: poll().
pub trait Future {
type Output;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
pub enum Poll<T> {
Ready(T), // value is available
Pending, // not ready yet — will call waker when it is
}
If poll() returns Pending, the future must have arranged for cx.waker().wake() to be called when progress can be made. If it doesn't, the task will never be polled again and will hang forever.
Poll Parameters Explained
self: Pin<&mut Self>
The future is accessed through a Pin reference. This guarantees the future won't be moved in memory between polls — critical because the compiler-generated state machine may contain self-references (pointers to its own local variables).
// Why Pin? Consider what .await compiles to:
async fn example() {
let data = vec![1, 2, 3];
let slice = &data[..]; // slice points into data
some_async_op().await; // yield point — state machine stores both
println!("{slice:?}"); // slice still needs to be valid!
}
// If the future were moved in memory after the yield,
// `slice` would be a dangling pointer. Pin prevents this.
cx: &mut Context<'_>
Context carries the Waker — the handle used to notify the executor that the future should be polled again.
// Context is essentially a wrapper around Waker:
pub struct Context<'a> {
waker: &'a Waker,
// ... (internal fields)
}
impl<'a> Context<'a> {
pub fn waker(&self) -> &'a Waker { self.waker }
}
The Waker — How the Executor Knows to Repoll
// Waker is Clone + Send + Sync — can be shared across threads
let waker: Waker = cx.waker().clone();
// Common pattern: stash waker for later notification
struct MyFuture {
shared_state: Arc<Mutex<SharedState>>,
}
struct SharedState {
completed: bool,
waker: Option<Waker>,
}
impl Future for MyFuture {
type Output = ();
fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<()> {
let mut state = self.shared_state.lock().unwrap();
if state.completed {
Poll::Ready(())
} else {
// Store the waker so the background thread can wake us
state.waker = Some(cx.waker().clone());
Poll::Pending
}
}
}
// Background thread completes the work and wakes the task:
fn background_work(state: Arc<Mutex<SharedState>>) {
// ... do slow work ...
let mut s = state.lock().unwrap();
s.completed = true;
if let Some(waker) = s.waker.take() {
waker.wake(); // ← tells executor to poll our future again
}
}
Implementing Custom Futures with poll()
Example 1: Timer Future
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Waker};
use std::thread;
use std::time::Duration;
struct TimerFuture {
shared: Arc<Mutex<TimerState>>,
}
struct TimerState {
expired: bool,
waker: Option<Waker>,
}
impl TimerFuture {
fn new(duration: Duration) -> Self {
let shared = Arc::new(Mutex::new(TimerState {
expired: false,
waker: None,
}));
// Spawn a thread that sleeps then wakes the task
let s = shared.clone();
thread::spawn(move || {
thread::sleep(duration);
let mut state = s.lock().unwrap();
state.expired = true;
if let Some(w) = state.waker.take() {
w.wake(); // notify executor
}
});
TimerFuture { shared }
}
}
impl Future for TimerFuture {
type Output = ();
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
let mut state = self.shared.lock().unwrap();
if state.expired {
Poll::Ready(())
} else {
state.waker = Some(cx.waker().clone());
Poll::Pending
}
}
}
// Usage:
// TimerFuture::new(Duration::from_secs(2)).await;
Example 2: Combining Two Futures (Race)
// A future that resolves to whichever of two futures finishes first
struct Race<A, B> {
a: A,
b: B,
}
impl<A, B, T> Future for Race<A, B>
where
A: Future<Output = T> + Unpin,
B: Future<Output = T> + Unpin,
{
type Output = T;
fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
// Try polling A first
if let Poll::Ready(val) = Pin::new(&mut self.a).poll(cx) {
return Poll::Ready(val);
}
// A not ready — try B
if let Poll::Ready(val) = Pin::new(&mut self.b).poll(cx) {
return Poll::Ready(val);
}
// Neither ready — both have registered wakers via cx
Poll::Pending
}
}
Building a Minimal Executor
To see how the poll model works end-to-end, here's a simplified single-threaded executor:
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
// A task is a pinned, boxed future
type Task = Pin<Box<dyn Future<Output = ()>>>;
// Simple waker that adds the task back to the queue
struct SimpleWaker {
queue: Arc<Mutex<VecDeque<Task>>>,
task_index: usize,
}
impl Wake for SimpleWaker {
fn wake(self: Arc<Self>) {
// Re-queue the task (in a real executor)
}
}
// The run loop — the heart of any executor
fn block_on<F: Future<Output = ()>>(fut: F) {
let mut fut = Box::pin(fut);
let waker = Arc::new(SimpleWaker {
queue: Arc::new(Mutex::new(VecDeque::new())),
task_index: 0,
});
let waker = Waker::from(waker);
let mut cx = Context::from_waker(&waker);
// Keep polling until the future completes
loop {
match fut.as_mut().poll(&mut cx) {
Poll::Ready(()) => return, // done!
Poll::Pending => {
// In a real executor: park thread, wait for waker
// Here: just busy-loop (don't do this in production!)
std::thread::yield_now();
}
}
}
}
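A trimmed, runnable variant of the same loop shows it end-to-end on stable Rust (NoopWaker is our own name, not a std type; the Wake trait has been stable since 1.51):

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing — fine here because we busy-loop anyway
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now(); // demo only; real executors park instead
    }
}

fn main() {
    let answer = block_on(async { 21 * 2 });
    assert_eq!(answer, 42);
    println!("answer = {answer}");
}
```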
Poll Rules & Common Mistakes
| Rule | Why | Mistake |
|---|---|---|
| Must register waker before returning Pending | Executor won't know to repoll otherwise | Task hangs forever |
| Don't poll a future after it returns Ready | May panic or misbehave (a logic error, not UB) | FusedFuture trait prevents this |
| Waker may change between polls | Task may be moved to different executor thread | Always use cx.waker() from the latest poll, not a stale one |
| wake() must be tolerated more than once | Multiple wakes for same event are possible | Don't assume wake = exactly-once |
| Don't block inside poll | Blocks the entire executor thread | Use spawn_blocking for blocking work |
- Use async/await for 99% of async code — the compiler does the hard work
- Implement Future manually when: building combinators (like Race above), writing zero-alloc futures, creating leaf futures that interface with OS I/O, or building custom executors/runtimes
54. Traits Deep Dive Core
Traits are Rust's mechanism for shared behavior — similar to interfaces in other languages, but far more powerful. A trait defines a set of methods that a type can implement. Traits enable polymorphism, operator overloading, and the entire generics system.
54.1 Defining and Implementing Traits
// Define a trait with required and provided (default) methods
trait Summary {
// Required — every implementor MUST provide this
fn summarize_author(&self) -> String;
// Provided (default implementation) — implementors CAN override
fn summarize(&self) -> String {
format!("(Read more from {}...)", self.summarize_author())
}
}
struct Article {
title: String,
author: String,
content: String,
}
impl Summary for Article {
fn summarize_author(&self) -> String {
self.author.clone()
}
// summarize() uses the default implementation
}
struct Tweet {
username: String,
content: String,
}
impl Summary for Tweet {
fn summarize_author(&self) -> String {
format!("@{}", self.username)
}
// Override the default
fn summarize(&self) -> String {
format!("{}: {}", self.username, self.content)
}
}
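Calling these implementations shows the default versus the override (a standalone sketch that restates the types above, with Article trimmed to its author field):

```rust
trait Summary {
    fn summarize_author(&self) -> String;
    fn summarize(&self) -> String {
        format!("(Read more from {}...)", self.summarize_author())
    }
}

struct Article { author: String }
impl Summary for Article {
    fn summarize_author(&self) -> String { self.author.clone() }
    // summarize() falls back to the default
}

struct Tweet { username: String, content: String }
impl Summary for Tweet {
    fn summarize_author(&self) -> String { format!("@{}", self.username) }
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

fn main() {
    let a = Article { author: "Ann".into() };
    let t = Tweet { username: "bob".into(), content: "hi".into() };
    assert_eq!(a.summarize(), "(Read more from Ann...)"); // default impl
    assert_eq!(t.summarize(), "bob: hi");                 // override
    println!("ok");
}
```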
54.2 Trait Bounds and Where Clauses
// Trait bounds — three equivalent syntaxes
// 1. impl Trait (sugar, most concise)
fn notify(item: &impl Summary) {
println!("Breaking: {}", item.summarize());
}
// 2. Trait bound syntax (explicit generic)
fn notify<T: Summary>(item: &T) {
println!("Breaking: {}", item.summarize());
}
// 3. Where clause (cleaner for complex bounds)
fn notify<T>(item: &T)
where
T: Summary + Display + Clone,
{
println!("Breaking: {}", item.summarize());
}
// Multiple bounds with + syntax
fn process<T: Summary + Clone + Debug>(item: T) { /* ... */ }
// Multiple generics with where clause
fn transfer<S, D>(src: &S, dst: &mut D)
where
S: Serialize + Debug,
D: Deserialize + Default,
{
/* ... */
}
// Return impl Trait — caller doesn't know the concrete type
fn make_summarizable() -> impl Summary {
Tweet { username: "bot".into(), content: "hello".into() }
}
// Conditional method implementation
struct Pair<T> { x: T, y: T }
impl<T> Pair<T> {
fn new(x: T, y: T) -> Self { Pair { x, y } }
}
// This method only exists when T: Display + PartialOrd
impl<T: Display + PartialOrd> Pair<T> {
fn cmp_display(&self) {
if self.x >= self.y {
println!("Largest is x = {}", self.x);
} else {
println!("Largest is y = {}", self.y);
}
}
}
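A quick standalone check of the conditional method: cmp_display exists for i32 (Display + PartialOrd) but not for types missing those bounds.

```rust
use std::fmt::Display;

struct Pair<T> { x: T, y: T }

impl<T> Pair<T> {
    fn new(x: T, y: T) -> Self { Pair { x, y } }
}

// Only exists when T: Display + PartialOrd
impl<T: Display + PartialOrd> Pair<T> {
    fn cmp_display(&self) {
        if self.x >= self.y {
            println!("Largest is x = {}", self.x);
        } else {
            println!("Largest is y = {}", self.y);
        }
    }
}

fn main() {
    Pair::new(3, 7).cmp_display(); // prints "Largest is y = 7"

    // Pair::new still works for any T, but the method is absent:
    let _p = Pair::new(vec![1], vec![2]);
    // _p.cmp_display(); // ✗ Vec<i32> does not implement Display
}
```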
54.3 Blanket Implementations
Blanket impls provide a trait implementation for all types matching a bound. The standard library uses these extensively.
// Standard library example: ToString for anything implementing Display
impl<T: Display> ToString for T {
fn to_string(&self) -> String {
format!("{}", self)
}
}
// Because of this, every Display type automatically gets .to_string()
// Standard library: Into<U> is auto-implemented when From<T> exists
impl<T, U> Into<U> for T
where
U: From<T>,
{
fn into(self) -> U {
U::from(self)
}
}
// Your own blanket impl
trait Printable {
fn print(&self);
}
impl<T: Debug> Printable for T {
fn print(&self) {
println!("{:?}", self);
}
}
// Now EVERY Debug type has a .print() method!
54.4 Associated Types vs Generic Traits
// Associated type — ONE implementation per type
trait Iterator {
type Item; // associated type
fn next(&mut self) -> Option<Self::Item>;
}
// Can only impl Iterator once for Counter (Item is fixed)
struct Counter { count: u32 }
impl Iterator for Counter {
type Item = u32;
fn next(&mut self) -> Option<u32> {
self.count += 1;
if self.count < 6 { Some(self.count) } else { None }
}
}
// Generic trait — MULTIPLE implementations per type
trait ConvertTo<T> {
fn convert(&self) -> T;
}
struct Celsius(f64);
// Implement conversion to multiple types!
impl ConvertTo<f64> for Celsius {
fn convert(&self) -> f64 { self.0 * 1.8 + 32.0 } // to Fahrenheit
}
impl ConvertTo<String> for Celsius {
fn convert(&self) -> String { format!("{}°C", self.0) }
}
// When to use which?
// Associated type: there's ONE natural implementation (Iterator, Deref)
// Generic trait: you want MULTIPLE implementations (From<T>, Add<Rhs>)
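Driving both ConvertTo impls (standalone sketch): the annotation on the binding selects which implementation the call resolves to.

```rust
trait ConvertTo<T> {
    fn convert(&self) -> T;
}

struct Celsius(f64);

impl ConvertTo<f64> for Celsius {
    fn convert(&self) -> f64 { self.0 * 1.8 + 32.0 } // to Fahrenheit
}

impl ConvertTo<String> for Celsius {
    fn convert(&self) -> String { format!("{}°C", self.0) }
}

fn main() {
    let c = Celsius(100.0);
    let f: f64 = c.convert();    // resolves to ConvertTo<f64>
    let s: String = c.convert(); // resolves to ConvertTo<String>
    assert!((f - 212.0).abs() < 1e-9);
    assert_eq!(s, "100°C");
    println!("{s} = {f}°F");
}
```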
54.5 Supertraits (Trait Inheritance)
// A supertrait is a trait that REQUIRES another trait
trait Animal: Display + Debug {
fn name(&self) -> &str;
fn sound(&self) -> &str;
}
// Any type implementing Animal MUST also implement Display + Debug
use std::fmt;
#[derive(Debug)]
struct Dog { name: String }
impl fmt::Display for Dog {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "Dog({})", self.name)
}
}
impl Animal for Dog {
fn name(&self) -> &str { &self.name }
fn sound(&self) -> &str { "Woof" }
}
// Supertrait chain
trait Shape: Display {
fn area(&self) -> f64;
}
trait Drawable: Shape {
fn draw(&self);
}
// Drawable requires Shape, which requires Display
// Implementors must satisfy ALL three traits
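The payoff of a supertrait bound is that generic code over Animal gets Display and Debug for free. A runnable sketch reusing the Dog type above (the `describe` helper is illustrative, not from the text):

```rust
use std::fmt::{self, Debug, Display};

trait Animal: Display + Debug {
    fn sound(&self) -> &str;
}

#[derive(Debug)]
struct Dog { name: String }

impl Display for Dog {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Dog({})", self.name)
    }
}

impl Animal for Dog {
    fn sound(&self) -> &str { "Woof" }
}

// The single Animal bound lets us use {} (Display) without writing T: Display
fn describe<T: Animal>(a: &T) -> String {
    format!("{a} says {}", a.sound())
}

fn main() {
    let d = Dog { name: "Rex".into() };
    assert_eq!(describe(&d), "Dog(Rex) says Woof");
    println!("{}", describe(&d));
}
```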
54.6 Trait Objects, Static vs Dynamic Dispatch
// Heterogeneous collection — only possible with dyn
let shapes: Vec<Box<dyn Shape>> = vec![
Box::new(Circle { radius: 5.0 }),
Box::new(Rect { w: 3.0, h: 4.0 }),
];
for s in &shapes {
println!("Area: {}", s.area()); // vtable dispatch each call
}
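The snippet above assumes a Shape trait with Circle and Rect implementors. A self-contained sketch (with hypothetical minimal definitions) contrasting the two dispatch strategies the section title names:

```rust
use std::f64::consts::PI;

// Hypothetical minimal Shape hierarchy matching the snippet above
trait Shape { fn area(&self) -> f64; }

struct Circle { radius: f64 }
struct Rect { w: f64, h: f64 }

impl Shape for Circle { fn area(&self) -> f64 { PI * self.radius * self.radius } }
impl Shape for Rect { fn area(&self) -> f64 { self.w * self.h } }

// Static dispatch: a monomorphized copy per concrete type; calls can inline
fn total_area_static<S: Shape>(shapes: &[S]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

// Dynamic dispatch: one compiled function; each call goes through the vtable
fn total_area_dyn(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    // Homogeneous slice: static dispatch works, no boxing
    let rects = [Rect { w: 3.0, h: 4.0 }, Rect { w: 1.0, h: 2.0 }];
    assert_eq!(total_area_static(&rects), 14.0);

    // Heterogeneous collection: requires dyn + boxing
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { radius: 1.0 }),
        Box::new(Rect { w: 3.0, h: 4.0 }),
    ];
    assert!((total_area_dyn(&shapes) - (PI + 12.0)).abs() < 1e-9);
}
```

Static dispatch costs code size (one copy per type) but enables inlining; dynamic dispatch costs an indirect call but allows mixed-type collections.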
54.7 Object Safety Rules
Not every trait can become a dyn Trait. A trait is object-safe if:
| Rule | Why | Example Violation |
|---|---|---|
| All methods have a dispatchable receiver (&self, &mut self, Box<Self>, ...) | Vtable needs a concrete object to call on | fn new() -> Self (no receiver) |
| No methods return Self | Compiler doesn't know Self's size behind dyn | fn clone(&self) -> Self |
| No generic type parameters on methods | Can't monomorphize through a vtable | fn process<T>(&self, t: T) |
| Trait doesn't require Self: Sized | dyn Trait is unsized | trait Foo: Sized { } |
// Workaround: use where Self: Sized to exclude methods from vtable
trait Cloneable {
fn clone_box(&self) -> Box<dyn Cloneable>;
// This method is excluded from the vtable
fn static_only(&self) where Self: Sized {
// only callable on concrete types, not dyn Cloneable
}
}
54.8 Trait Upcasting (Rust 1.86+)
// Trait upcasting: convert dyn Subtrait → dyn Supertrait
trait Base {
fn base_method(&self);
}
trait Derived: Base {
fn derived_method(&self);
}
fn takes_base(obj: &dyn Base) {
obj.base_method();
}
fn example(obj: &dyn Derived) {
// Before Rust 1.86: compiler error!
// After Rust 1.86: works via trait upcasting
takes_base(obj); // &dyn Derived → &dyn Base ✅
}
54.9 Advanced Trait Patterns
// 1. Newtype pattern — implement foreign trait for foreign type
struct Wrapper(Vec<String>);
impl fmt::Display for Wrapper {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "[{}]", self.0.join(", "))
}
}
// Now Vec<String> (via Wrapper) has Display!
// 2. Extension trait — add methods to foreign types
trait VecExt<T> {
fn median(&self) -> Option<&T>;
}
impl<T: Ord> VecExt<T> for Vec<T> {
fn median(&self) -> Option<&T> { // middle element; assumes `self` is sorted
if self.is_empty() { return None; }
Some(&self[self.len() / 2])
}
}
// 3. Marker traits — no methods, just a type-level flag
trait Validated {} // marker: this type has been validated
struct Email<S> { address: String, _state: std::marker::PhantomData<S> }
struct Unvalidated;
struct Valid;
impl Validated for Valid {}
// Only validated emails can be sent: the Validated bound enforces it
fn send<S: Validated>(email: Email<S>) { /* ... */ }
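A runnable sketch of the typestate flow above, with a hypothetical `validate` step (not from the text) that upgrades the type-level state on success:

```rust
use std::marker::PhantomData;

trait Validated {}          // marker: this state counts as validated
struct Unvalidated;
struct Valid;
impl Validated for Valid {}

struct Email<S> { address: String, _state: PhantomData<S> }

impl Email<Unvalidated> {
    fn new(address: &str) -> Self {
        Email { address: address.to_string(), _state: PhantomData }
    }
    // Hypothetical check: on success, re-wrap with the Valid state
    fn validate(self) -> Result<Email<Valid>, String> {
        if self.address.contains('@') {
            Ok(Email { address: self.address, _state: PhantomData })
        } else {
            Err(format!("invalid address: {}", self.address))
        }
    }
}

// The marker bound makes unvalidated sends a compile error
fn send<S: Validated>(email: Email<S>) -> String {
    format!("sent to {}", email.address)
}

fn main() {
    let valid = Email::new("alice@example.com").validate().unwrap();
    assert_eq!(send(valid), "sent to alice@example.com");
    // send(Email::new("nope")); // would not compile: Unvalidated: !Validated
}
```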
// 4. Operator overloading via traits
use std::ops::Add;
#[derive(Debug, Clone, Copy)]
struct Point { x: f64, y: f64 }
impl Add for Point {
type Output = Point;
fn add(self, other: Point) -> Point {
Point { x: self.x + other.x, y: self.y + other.y }
}
}
let p = Point { x: 1.0, y: 2.0 } + Point { x: 3.0, y: 4.0 };
// p = Point { x: 4.0, y: 6.0 }
// 5. Fully Qualified Syntax (disambiguation)
trait Pilot { fn fly(&self); }
trait Wizard { fn fly(&self); }
struct Human;
impl Pilot for Human {
fn fly(&self) { println!("This is your captain"); }
}
impl Wizard for Human {
fn fly(&self) { println!("Up!"); }
}
let person = Human;
Pilot::fly(&person); // "This is your captain"
Wizard::fly(&person); // "Up!"
// For associated functions (no &self), use fully qualified syntax:
// <Type as Trait>::function()
54.10 Common Derivable Traits
| Derive | What it gives you | Requirement |
|---|---|---|
#[derive(Debug)] | {:?} formatting | All fields must impl Debug |
#[derive(Clone)] | .clone() deep copy | All fields must impl Clone |
#[derive(Copy)] | Implicit copy on assignment | All fields must impl Copy (+ Clone) |
#[derive(PartialEq)] | == and != | All fields must impl PartialEq |
#[derive(Eq)] | Total equality (reflexive) | Must also derive PartialEq |
#[derive(PartialOrd)] | <, >, <=, >= | Must also derive PartialEq |
#[derive(Ord)] | Total ordering (for sorting) | Must also derive Eq + PartialOrd |
#[derive(Hash)] | Hashable (for HashMap keys) | All fields must impl Hash |
#[derive(Default)] | Type::default() | All fields must impl Default |
// Derive multiple traits at once
#[derive(Debug, Clone, PartialEq, Eq, Hash, Default)]
struct Config {
name: String,
retries: u32,
verbose: bool,
}
// Now you can: debug print, clone, compare, hash, and default-construct
let c1 = Config::default();
let c2 = c1.clone();
assert_eq!(c1, c2);
use std::collections::HashSet;
let mut set = HashSet::new();
set.insert(c1); // works because Config: Hash + Eq
55. Pin & Poll Explained Advanced
Pin and Poll are the two core primitives underlying Rust's async system. Understanding them deeply is essential for advanced async programming, custom futures, and runtime development.
55.1 What Problem Does Pin Solve?
Rust's async/await compiles into state machines that may contain self-referential data: a future holds both a value and a reference to that value across .await points. If the future were moved to a different memory location, the internal reference would become a dangling pointer.
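The problem can be made concrete without async at all. Below is a minimal, std-only sketch of a self-referential struct (the `SelfRef` type and its fields are illustrative names): `slice` points into `data`, so the value is only sound as long as it never moves, which is exactly what Pin plus PhantomPinned enforces.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// A self-referential struct: `slice` is a raw pointer into `data`
struct SelfRef {
    data: String,
    slice: *const u8,    // would dangle if the struct were moved
    _pin: PhantomPinned, // opts out of Unpin: once pinned, never moved
}

impl SelfRef {
    fn new(text: &str) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(SelfRef {
            data: text.to_string(),
            slice: std::ptr::null(),
            _pin: PhantomPinned,
        });
        // The heap address is now stable, so the self-reference is safe to set
        let ptr = boxed.data.as_ptr();
        // SAFETY: we only mutate a field; the value is never moved out of the Pin
        unsafe { boxed.as_mut().get_unchecked_mut().slice = ptr };
        boxed
    }

    fn first_byte(&self) -> u8 {
        // SAFETY: valid because Pin guarantees the struct (and `data`) never moved
        unsafe { *self.slice }
    }
}

fn main() {
    let s = SelfRef::new("hello");
    assert_eq!(s.first_byte(), b'h');
}
```

Compiler-generated futures contain exactly this shape of data (a local plus a reference to it, live across an .await), which is why poll takes Pin<&mut Self>.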
55.2 Pin<P> — The Guarantee
Pin<P> wraps a pointer type P (like &mut T, Box<T>) and guarantees that the pointed-to value will not be moved. This makes self-references safe.
use std::pin::Pin;
// Pin wraps a pointer, NOT the value itself
// Pin<&mut T> — pinned mutable reference
// Pin<Box<T>> — pinned heap allocation
// Pin<Arc<T>> — pinned shared pointer
// Creating pinned values
let mut val = 42;
let pinned: Pin<&mut i32> = Pin::new(&mut val);
// i32 is Unpin, so this is allowed and the value can still be moved
// Box::pin — the most common way to pin a value
let pinned: Pin<Box<String>> = Box::pin(String::from("hello"));
// String is on the heap, address is stable
// Accessing pinned data
let pinned = Box::pin(vec![1, 2, 3]);
let r: &Vec<i32> = &*pinned; // shared ref: always OK
println!("len = {}", pinned.len()); // Deref: always OK
55.3 Unpin — The Escape Hatch
// Most types are Unpin (auto-trait). Unpin means:
// "I don't have self-references, so I'm safe to move even when pinned"
// Unpin types (safe to move):
// i32, f64, String, Vec<T>, HashMap, Box<T>, &T, ... almost everything
// !Unpin types (NOT safe to move once pinned):
// - compiler-generated async state machines (futures)
// - types containing PhantomPinned
// - types with explicit `impl !Unpin` (nightly)
use std::marker::PhantomPinned;
struct Immovable {
data: String,
_pin: PhantomPinned, // opts out of Unpin
}
// For Unpin types, Pin is essentially transparent:
let mut x = String::from("hello");
let pinned = Pin::new(&mut x);
let unpinned: &mut String = Pin::into_inner(pinned); // ✅ Unpin
// For !Unpin types, you CANNOT get &mut T from Pin<&mut T> safely
// (only via unsafe get_unchecked_mut)
Pin only matters for !Unpin types. For Unpin types (which is almost everything), Pin is a no-op wrapper. The reason Pin appears everywhere in async code is that compiler-generated futures are !Unpin.
55.4 Pin Projection
When you have a Pin<&mut MyStruct>, how do you access individual fields? This is called projection.
// Manual projection (unsafe!)
struct MyFuture {
// This field must stay pinned (it's a future itself)
inner: SomeInnerFuture,
// This field is safe to move (just a counter)
count: u32,
}
impl MyFuture {
// Pin-project to the inner future (structural pinning)
fn inner(self: Pin<&mut Self>) -> Pin<&mut SomeInnerFuture> {
unsafe { self.map_unchecked_mut(|s| &mut s.inner) }
}
// Non-pinned field: safe to access normally
fn count(self: Pin<&mut Self>) -> &mut u32 {
unsafe { &mut self.get_unchecked_mut().count }
}
}
// Safe alternative: use the pin-project crate!
use pin_project::pin_project;
#[pin_project]
struct MyFuture {
#[pin] // this field is structurally pinned
inner: SomeInnerFuture,
count: u32, // this field is NOT pinned
}
// pin-project generates safe projection methods:
fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<()> { // inside `impl Future for MyFuture`
let this = self.project(); // safe!
// this.inner: Pin<&mut SomeInnerFuture> (pinned)
// this.count: &mut u32 (not pinned)
*this.count += 1;
this.inner.poll(cx)
}
55.5 Poll — The Async Execution Model
Rust's async model is poll-based, not callback-based. A future doesn't "notify" you when done — instead, the runtime polls the future, and the future says "ready" or "not yet".
// The core enum
enum Poll<T> {
Ready(T), // computation is complete, here's the result
Pending, // not done yet, call me again later
}
// The Future trait — every async fn compiles to this
trait Future {
type Output;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
// ▲ ▲
// │ │
// Pin: future can't move Context: contains the Waker
55.6 Context and Waker
// Context carries the Waker — the "callback" that tells the
// executor "hey, this future is ready to be polled again"
use std::task::{Context, Waker, Poll};
use std::future::Future;
use std::pin::Pin;
// Inside a custom future's poll():
impl Future for MyFuture {
type Output = String;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<String> {
if self.is_ready() {
Poll::Ready(self.get_result())
} else {
// Clone the waker and register it with the I/O source
let waker: Waker = cx.waker().clone();
// When I/O completes, the I/O source calls waker.wake()
// which tells the executor to poll this future again
self.register_waker(waker);
Poll::Pending
}
}
}
// IMPORTANT RULES:
// 1. If you return Pending, you MUST have arranged for wake() to be called
// Otherwise: the future is never polled again → deadlock!
// 2. wake() may be called from ANY thread (it's Send + Sync)
// 3. Spurious wakes are OK — poll() might return Pending again
// 4. Always use the LATEST waker from cx (the executor may change it)
55.7 Custom Future Example: Sleep Timer
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use std::time::{Duration, Instant};
struct Sleep {
deadline: Instant,
}
impl Sleep {
fn new(duration: Duration) -> Self {
Sleep { deadline: Instant::now() + duration }
}
}
impl Future for Sleep {
type Output = ();
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
if Instant::now() >= self.deadline {
Poll::Ready(())
} else {
// Schedule a wake-up (in a real runtime, this would
// register with a timer wheel)
let waker = cx.waker().clone();
let deadline = self.deadline;
std::thread::spawn(move || {
std::thread::sleep(deadline - Instant::now());
waker.wake(); // tell executor to poll us again
});
Poll::Pending
}
}
}
// Usage: Sleep::new(Duration::from_secs(1)).await;
55.8 Combining Pin & Poll: Future Combinators
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use pin_project::pin_project;
/// Runs two futures concurrently, returns whichever finishes first
#[pin_project]
struct Race<A, B> {
#[pin] a: A,
#[pin] b: B,
}
impl<T, A, B> Future for Race<A, B>
where
A: Future<Output = T>,
B: Future<Output = T>,
{
type Output = T;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
let this = self.project(); // safe pin projection
// Poll A first
if let Poll::Ready(val) = this.a.poll(cx) {
return Poll::Ready(val);
}
// Then poll B
if let Poll::Ready(val) = this.b.poll(cx) {
return Poll::Ready(val);
}
// Neither ready
Poll::Pending
}
}
/// Map the output of a future
#[pin_project]
struct Map<Fut, F> {
#[pin] future: Fut,
f: Option<F>, // Option so we can take() it once
}
impl<Fut, F, T> Future for Map<Fut, F>
where
Fut: Future,
F: FnOnce(Fut::Output) -> T,
{
type Output = T;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
let this = self.project();
match this.future.poll(cx) {
Poll::Ready(output) => {
let f = this.f.take().expect("polled after completion");
Poll::Ready(f(output))
}
Poll::Pending => Poll::Pending,
}
}
}
55.9 Summary: Pin & Poll Mental Model
(1) Use async/await for 99% of code — Pin and Poll are handled automatically. (2) Implement Future manually only for custom combinators, leaf futures, or executors. (3) Use the pin-project crate for safe pin projection. (4) Always register a waker before returning Pending. (5) Pin only constrains !Unpin types — for most types it's a no-op.
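To tie the pieces together, here is a toy, std-only executor: it pins a future and polls it in a loop until Ready. This is a sketch of the mental model, not how a real runtime like Tokio works (real executors park until the Waker fires instead of busy-waiting, and `NoopWaker`/`block_on` are illustrative names):

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing: acceptable only because we busy-poll below
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal single-future executor: poll in a loop until the future is Ready
fn block_on<F: Future>(mut fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    // SAFETY: `fut` is shadowed by its own pinned borrow and never moved again
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // A real executor would sleep here until wake() is called
            Poll::Pending => std::thread::yield_now(),
        }
    }
}

fn main() {
    // The async block compiles to a state machine implementing Future
    let answer = block_on(async { 21 * 2 });
    assert_eq!(answer, 42);
}
```

Note the two primitives meeting in one place: Pin guarantees the state machine's address is stable across polls, and Poll is the contract through which the executor drives it.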
56. Tokio Runtime Advanced
Tokio is Rust's most widely-used async runtime. It provides a multi-threaded, work-stealing scheduler, async I/O primitives (TCP, UDP, Unix sockets, files), timers, channels, and synchronization utilities. Nearly every async Rust project in production depends on Tokio.
56.1 Runtime Setup
// The #[tokio::main] macro — most common entry point
#[tokio::main]
async fn main() {
println!("Running on Tokio!");
}
// Expands to:
// fn main() {
// tokio::runtime::Runtime::new().unwrap().block_on(async { ... })
// }
// Flavor options
#[tokio::main(flavor = "multi_thread", worker_threads = 4)]
async fn main() { /* 4 worker threads */ }
#[tokio::main(flavor = "current_thread")]
async fn main() { /* single-threaded — no Send required */ }
// Manual runtime construction (for libraries or advanced config)
fn main() {
let rt = tokio::runtime::Builder::new_multi_thread()
.worker_threads(8)
.enable_all() // enable I/O + time drivers
.thread_name("my-worker")
.thread_stack_size(3 * 1024 * 1024) // 3 MB per thread
.build()
.unwrap();
rt.block_on(async {
println!("Custom runtime!");
});
}
// For tests
#[tokio::test]
async fn test_something() {
let result = my_async_fn().await;
assert_eq!(result, 42);
}
56.2 Spawning Tasks
use tokio::task;
// tokio::spawn — run a future concurrently on the runtime
// Returns JoinHandle<T> to await the result
#[tokio::main]
async fn main() {
// Spawned tasks run independently — they're like lightweight threads
let handle = tokio::spawn(async {
// This runs on any worker thread
expensive_computation().await
});
// Do other work while task runs...
do_something_else().await;
// Await the spawned task's result
let result = handle.await.unwrap(); // JoinError if task panicked
// IMPORTANT: spawned futures must be Send + 'static
// because they may move between worker threads
}
// spawn_blocking — run blocking/CPU-heavy code without starving the runtime
let hash = task::spawn_blocking(move || { // `move`: the closure must be 'static
// This runs on a dedicated blocking thread pool
compute_hash(&large_data) // CPU-intensive, blocks the thread
}).await.unwrap();
// spawn_local — run !Send futures on the current thread
let local = task::LocalSet::new();
local.run_until(async {
task::spawn_local(async {
// Can use Rc, RefCell, etc. (non-Send types)
let rc = std::rc::Rc::new(42);
some_async_fn().await;
println!("{}", rc);
}).await.unwrap();
}).await;
// JoinSet — manage a dynamic set of spawned tasks
use tokio::task::JoinSet;
let mut set = JoinSet::new();
for i in 0..10 {
set.spawn(async move { fetch_url(i).await });
}
// Collect results as they complete (order is NOT guaranteed)
while let Some(result) = set.join_next().await {
match result {
Ok(value) => println!("Got: {value}"),
Err(e) => eprintln!("Task failed: {e}"),
}
}
Calling std::thread::sleep(), std::fs::read(), or any other blocking I/O inside an async task starves every other task scheduled on that worker thread. Use tokio::time::sleep(), tokio::fs::read(), or spawn_blocking() instead.
56.3 Async I/O
use tokio::net::{TcpListener, TcpStream};
use tokio::io::{AsyncReadExt, AsyncWriteExt};
// TCP server
#[tokio::main]
async fn main() -> std::io::Result<()> {
let listener = TcpListener::bind("127.0.0.1:8080").await?;
println!("Listening on :8080");
loop {
let (socket, addr) = listener.accept().await?;
println!("New connection from {addr}");
// Spawn a task per connection — handles thousands concurrently
tokio::spawn(async move {
handle_connection(socket).await;
});
}
}
async fn handle_connection(mut socket: TcpStream) {
let mut buf = [0u8; 1024];
loop {
let n = socket.read(&mut buf).await.unwrap();
if n == 0 { return; } // connection closed
socket.write_all(&buf[..n]).await.unwrap(); // echo back
}
}
// TCP client
async fn connect() -> std::io::Result<()> {
let mut stream = TcpStream::connect("127.0.0.1:8080").await?;
stream.write_all(b"Hello!").await?;
let mut buf = String::new();
stream.read_to_string(&mut buf).await?;
println!("Server said: {buf}");
Ok(())
}
// File I/O (tokio::fs)
use tokio::fs;
let contents = fs::read_to_string("config.toml").await?;
fs::write("output.txt", "hello").await?;
56.4 Timers and Timeouts
use tokio::time::{sleep, timeout, interval, Duration, Instant};
// sleep — non-blocking delay
sleep(Duration::from_secs(2)).await;
// timeout — wrap any future with a deadline
match timeout(Duration::from_secs(5), slow_operation()).await {
Ok(result) => println!("Completed: {result:?}"),
Err(_) => println!("Timed out after 5 seconds!"),
}
// interval — periodic timer
let mut ticker = interval(Duration::from_secs(1));
loop {
ticker.tick().await; // first tick completes immediately
println!("Tick at {:?}", Instant::now());
}
// sleep_until — sleep until a specific instant
let deadline = Instant::now() + Duration::from_secs(10);
tokio::time::sleep_until(deadline).await;
56.5 Channels (Tokio's async message passing)
// mpsc — multi-producer, single-consumer (most common)
use tokio::sync::mpsc;
let (tx, mut rx) = mpsc::channel::<String>(100); // bounded, capacity 100
// Producer
let tx2 = tx.clone(); // clone for multiple producers
tokio::spawn(async move {
tx.send("hello".into()).await.unwrap();
tx.send("world".into()).await.unwrap();
});
tokio::spawn(async move {
tx2.send("from task 2".into()).await.unwrap();
});
// Consumer
while let Some(msg) = rx.recv().await {
println!("Got: {msg}");
}
// oneshot — single value, single use (like a promise)
use tokio::sync::oneshot;
let (tx, rx) = oneshot::channel();
tokio::spawn(async move {
let result = compute_something().await;
tx.send(result).unwrap();
});
let value = rx.await.unwrap(); // wait for the one result
// broadcast — multi-producer, multi-consumer
use tokio::sync::broadcast;
let (tx, _) = broadcast::channel::<String>(16);
let mut rx1 = tx.subscribe();
let mut rx2 = tx.subscribe();
tx.send("event".into()).unwrap();
// Both rx1 and rx2 receive "event"
// watch — single-producer, multi-consumer, only latest value
use tokio::sync::watch;
let (tx, mut rx) = watch::channel("initial".to_string());
tokio::spawn(async move {
while rx.changed().await.is_ok() {
println!("Config updated: {}", *rx.borrow());
}
});
tx.send("updated value".into()).unwrap();
Choosing a channel: mpsc for work distribution (fan-out); oneshot for a single async result (request/response); broadcast for pub/sub (events); watch for config changes (latest-value-wins).
56.6 Synchronization Primitives
use tokio::sync::{Mutex, RwLock, Semaphore, Notify};
// tokio::sync::Mutex — async-aware mutex (holds lock across .await)
let data = Arc::new(tokio::sync::Mutex::new(vec![1, 2, 3]));
{
let mut guard = data.lock().await; // async lock acquisition
guard.push(4);
some_async_call().await; // safe to hold across .await
} // lock released
// When to use tokio::sync::Mutex vs std::sync::Mutex:
// tokio::sync::Mutex — when you need to hold lock across .await points
// std::sync::Mutex — for short, synchronous critical sections (faster)
// Semaphore — limit concurrent access
let sem = Arc::new(Semaphore::new(10)); // max 10 concurrent
for url in urls {
let permit = sem.clone().acquire_owned().await.unwrap();
tokio::spawn(async move {
fetch(url).await;
drop(permit); // release slot
});
}
// Notify — signal between tasks (like a condition variable)
let notify = Arc::new(Notify::new());
let n = notify.clone();
tokio::spawn(async move {
do_work().await;
n.notify_one(); // signal the waiter
});
notify.notified().await; // wait for signal
56.7 select! — Racing Futures
use tokio::select;
use tokio::time::{sleep, Duration};
// select! polls multiple futures, runs the branch of the FIRST to complete
async fn race_example() {
let mut interval = tokio::time::interval(Duration::from_secs(1));
let (tx, mut rx) = tokio::sync::mpsc::channel::<String>(10);
loop {
select! {
// Branch 1: message received
Some(msg) = rx.recv() => {
println!("Message: {msg}");
}
// Branch 2: timer ticked
_ = interval.tick() => {
println!("Heartbeat");
}
// Branch 3: shutdown signal
_ = tokio::signal::ctrl_c() => {
println!("Shutting down...");
break;
}
}
}
}
// Cancellation safety warning:
// When one branch wins, the OTHER futures are DROPPED (cancelled).
// Some futures are NOT cancellation-safe:
// ❌ tokio::io::AsyncReadExt::read_exact() — may lose partially-read data
// ✅ tokio::io::AsyncReadExt::read() — cancel-safe: no data is read if cancelled
// ✅ mpsc::Receiver::recv() — safe to cancel
// ✅ oneshot::Receiver — safe to cancel
// Always check docs for cancellation safety!
56.8 Graceful Shutdown Pattern
use tokio::sync::watch;
use tokio::signal;
#[tokio::main]
async fn main() {
// Shutdown signal channel
let (shutdown_tx, shutdown_rx) = watch::channel(false);
// Spawn worker tasks with shutdown receiver
for i in 0..4 {
let mut rx = shutdown_rx.clone();
tokio::spawn(async move {
loop {
tokio::select! {
_ = do_work(i) => {},
_ = rx.changed() => {
println!("Worker {i} shutting down");
break;
}
}
}
});
}
// Wait for Ctrl+C
signal::ctrl_c().await.unwrap();
println!("Shutdown signal received");
shutdown_tx.send(true).unwrap();
// Give tasks time to clean up
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
}
56.9 Common Patterns & Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Blocking in async task | All tasks freeze / slow down | Use spawn_blocking() or async alternatives |
| Holding std::sync::MutexGuard across .await | Deadlock or Send errors | Use tokio::sync::Mutex or scope the guard |
| Unbounded channels | Memory grows without limit | Use bounded channels with backpressure |
| Forgetting to .await spawned JoinHandle | Silent task failures | Always handle JoinHandle or use JoinSet |
| CPU-bound work on runtime | Latency spikes for all tasks | Use spawn_blocking() or rayon |
| Using Rc in spawned task | Compiler error: Rc is not Send | Use Arc or spawn_local |
57. Tonic — gRPC in Rust Advanced
Tonic is Rust's premier gRPC framework, built on top of Tokio and Hyper. It uses Protocol Buffers (protobuf) for service definition and code generation, providing strongly-typed, high-performance RPC with support for streaming, TLS, interceptors, and load balancing.
57.1 What is gRPC?
gRPC is a contract-first RPC framework: you define services and messages in a .proto file, the protobuf compiler generates strongly-typed client and server code, and calls travel as binary protobuf over HTTP/2. It supports four call shapes: unary, server streaming, client streaming, and bidirectional streaming.
57.2 Project Setup
# Cargo.toml
[dependencies]
tonic = "0.12"
prost = "0.13" # protobuf runtime
tokio = { version = "1", features = ["full"] }
[build-dependencies]
tonic-build = "0.12" # protobuf code generator
# Install protoc (protobuf compiler):
# macOS: brew install protobuf
# Ubuntu: apt install protobuf-compiler
# Or set PROTOC env var to the protoc binary path
57.3 Define the Service (Protobuf)
// proto/greeter.proto
syntax = "proto3";
package greeter;
// Service definition — generates Rust traits
service Greeter {
// Unary RPC
rpc SayHello (HelloRequest) returns (HelloReply);
// Server-side streaming
rpc SayHelloStream (HelloRequest) returns (stream HelloReply);
}
message HelloRequest {
string name = 1;
}
message HelloReply {
string message = 1;
}
57.4 Build Script
// build.rs — runs at compile time, generates Rust code from .proto
fn main() -> Result<(), Box<dyn std::error::Error>> {
tonic_build::compile_protos("proto/greeter.proto")?;
Ok(())
}
// Generates greeter.rs (in OUT_DIR, under target/) containing:
// - HelloRequest, HelloReply structs (from prost)
// - greeter_server::Greeter trait (to implement)
// - greeter_server::GreeterServer (to serve)
// - greeter_client::GreeterClient (to call)
57.5 Implementing the Server
use tonic::{transport::Server, Request, Response, Status};
// Include the generated code
pub mod greeter {
tonic::include_proto!("greeter");
}
use greeter::greeter_server::{Greeter, GreeterServer};
use greeter::{HelloRequest, HelloReply};
// Implement the service trait
#[derive(Debug, Default)]
struct MyGreeter {}
#[tonic::async_trait]
impl Greeter for MyGreeter {
// Unary RPC handler
async fn say_hello(
&self,
request: Request<HelloRequest>,
) -> Result<Response<HelloReply>, Status> {
let name = &request.into_inner().name;
println!("Got request from: {name}");
let reply = HelloReply {
message: format!("Hello, {name}!"),
};
Ok(Response::new(reply))
}
// Server streaming RPC
type SayHelloStreamStream =
tokio_stream::wrappers::ReceiverStream<Result<HelloReply, Status>>;
async fn say_hello_stream(
&self,
request: Request<HelloRequest>,
) -> Result<Response<Self::SayHelloStreamStream>, Status> {
let name = request.into_inner().name;
let (tx, rx) = tokio::sync::mpsc::channel(4);
tokio::spawn(async move {
for i in 0..5 {
let reply = HelloReply {
message: format!("Hello {name} #{i}"),
};
tx.send(Ok(reply)).await.unwrap();
tokio::time::sleep(std::time::Duration::from_millis(500)).await;
}
});
Ok(Response::new(tokio_stream::wrappers::ReceiverStream::new(rx)))
}
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let addr = "[::1]:50051".parse()?;
let greeter = MyGreeter::default();
println!("gRPC server listening on {addr}");
Server::builder()
.add_service(GreeterServer::new(greeter))
.serve(addr)
.await?;
Ok(())
}
57.6 Client
use greeter::greeter_client::GreeterClient;
use greeter::HelloRequest;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut client = GreeterClient::connect("http://[::1]:50051").await?;
// Unary call
let request = tonic::Request::new(HelloRequest {
name: "Rust".into(),
});
let response = client.say_hello(request).await?;
println!("Response: {}", response.into_inner().message);
// Server streaming call
let request = tonic::Request::new(HelloRequest {
name: "Stream".into(),
});
let mut stream = client.say_hello_stream(request).await?.into_inner();
while let Some(reply) = stream.message().await? {
println!("Streaming: {}", reply.message);
}
Ok(())
}
57.7 Interceptors & Metadata
use tonic::{Request, Status};
// Server-side interceptor (middleware)
fn auth_interceptor(req: Request<()>) -> Result<Request<()>, Status> {
match req.metadata().get("authorization") {
Some(token) if token == "Bearer secret" => Ok(req),
_ => Err(Status::unauthenticated("Invalid token")),
}
}
// Apply interceptor to server
Server::builder()
.add_service(GreeterServer::with_interceptor(
MyGreeter::default(),
auth_interceptor,
))
.serve(addr)
.await?;
// Client-side: attach metadata to requests
let mut request = tonic::Request::new(HelloRequest { name: "auth".into() });
request.metadata_mut().insert(
"authorization",
"Bearer secret".parse().unwrap(),
);
let response = client.say_hello(request).await?;
57.8 TLS & Advanced Config
use tonic::transport::{Server, Identity, ServerTlsConfig, Certificate};
// Server with TLS
let cert = tokio::fs::read("server.pem").await?;
let key = tokio::fs::read("server.key").await?;
let identity = Identity::from_pem(cert, key);
Server::builder()
.tls_config(ServerTlsConfig::new().identity(identity))?
.add_service(GreeterServer::new(MyGreeter::default()))
.serve(addr)
.await?;
// Client with TLS
use tonic::transport::{Channel, ClientTlsConfig};
let ca_cert = tokio::fs::read("ca.pem").await?;
let tls = ClientTlsConfig::new()
.ca_certificate(Certificate::from_pem(ca_cert))
.domain_name("example.com");
let channel = Channel::from_static("https://example.com:50051")
.tls_config(tls)?
.connect()
.await?;
let client = GreeterClient::new(channel);
The Tonic ecosystem includes tonic-build (code gen), tonic-reflection (gRPC server reflection), tonic-health (health checking), tonic-web (gRPC-Web support for browsers), and tower middleware (rate limiting, timeouts, load balancing).
58. Popular Rust Libraries Reference
The Rust ecosystem is rich with high-quality crates. Here's a curated guide to the most widely-used libraries, organized by domain.
🌐 Web Frameworks
| Crate | Description | Key Feature |
|---|---|---|
axum | Ergonomic, modular web framework built on Tokio + Tower + Hyper | Type-safe extractors, Tower middleware, first-party Tokio team |
actix-web | Powerful, blazing fast web framework using actor model | Top benchmark performer, mature, WebSocket support |
warp | Composable web framework based on filter combinators | Functional-style routing, built-in WebSocket |
rocket | Developer-friendly framework with attribute-based routing | Codegen-driven, form handling, templating, managed state |
poem | Full-featured async web framework | OpenAPI integration, built-in middleware, elegant API |
📡 HTTP & Networking
| Crate | Description | Key Feature |
|---|---|---|
reqwest | High-level HTTP client (async and blocking) | TLS, JSON, cookies, proxy, multipart, redirect handling |
hyper | Low-level HTTP/1 and HTTP/2 implementation | Foundation for axum, tonic, reqwest; zero-cost abstractions |
tower | Modular middleware framework for async services | Rate limiting, timeouts, load balancing, retry, concurrency limit |
rustls | Pure-Rust TLS implementation (no OpenSSL) | Memory-safe TLS, no C dependencies, FIPS support |
quinn | QUIC protocol implementation | HTTP/3 foundation, low-latency UDP-based transport |
🗄️ Database & ORM
| Crate | Description | Key Feature |
|---|---|---|
sqlx | Async SQL toolkit with compile-time checked queries | Compile-time SQL verification, Postgres/MySQL/SQLite, no ORM overhead |
diesel | Safe, extensible ORM and query builder | Compile-time query safety, migrations, type-safe schema |
sea-orm | Async ORM built on sqlx | ActiveRecord pattern, code generation, migrations |
rusqlite | Ergonomic SQLite bindings | Synchronous API, bundled SQLite, full-text search |
redis | Redis client (async + sync) | Cluster, pub/sub, streams, Lua scripting |
mongodb | Official MongoDB async driver | BSON, aggregation, transactions, change streams |
📦 Serialization & Data
| Crate | Description | Key Feature |
|---|---|---|
serde | Serialization/deserialization framework | Derive macros, zero-copy, ecosystem standard |
serde_json | JSON support for serde | Streaming, Value type, raw JSON |
toml | TOML parser/serializer | Rust's config format, serde integration |
csv | Fast CSV reader/writer | Serde support, streaming, custom delimiters |
prost | Protocol Buffers implementation | Code generation, used by tonic |
bincode | Compact binary serialization | Fast encoding, good for IPC and caching |
⚡ Async Runtime & Utilities
| Crate | Description | Key Feature |
|---|---|---|
tokio | Async runtime (multi-threaded scheduler, I/O, timers) | Work-stealing, spawn, channels, fs, net, signals |
async-std | Async version of Rust's std library | Familiar API, single-threaded or multi-threaded |
smol | Small, fast async runtime | Minimal footprint, composable with other runtimes |
futures | Foundational async utilities | Stream, Sink, FutureExt, select!, join!, channel |
tokio-stream | Stream utilities for Tokio | StreamExt, wrappers for Tokio types |
async-trait | Async methods in traits (pre-1.75) | Proc macro for async fn in traits |
🔐 Cryptography & Auth
| Crate | Description | Key Feature |
|---|---|---|
ring | Safe, fast cryptographic primitives | AES, SHA, HMAC, RSA, ECDSA, ChaCha20 |
rustls | Pure-Rust TLS library | No C deps, memory-safe, modern TLS |
jsonwebtoken | JWT encoding/decoding | RS256, HS256, ES256, validation |
argon2 | Password hashing | Argon2id (recommended), constant-time |
uuid | UUID generation and parsing | v4 (random), v7 (time-sortable), serde support |
📝 Logging, Tracing & Diagnostics
| Crate | Description | Key Feature |
|---|---|---|
tracing | Structured, async-aware diagnostics | Spans, events, subscribers, async-compatible |
tracing-subscriber | Tracing output formatters | JSON, pretty-print, env filter, layered |
log | Lightweight logging facade | Simple macros (info!, warn!), ecosystem standard |
env_logger | Logger configured via environment variables | RUST_LOG=debug, colored output |
color-eyre | Colorful error reports with context | Beautiful panics, SpanTrace integration |
🖥️ CLI & Terminal
| Crate | Description | Key Feature |
|---|---|---|
clap | Command-line argument parser | Derive macros, subcommands, shell completions, help generation |
dialoguer | Interactive prompts (select, confirm, input) | Multi-select, password input, themes |
indicatif | Progress bars and spinners | Multi-bar, custom templates, ETA |
colored | Terminal colors and styles | Simple API: "text".red().bold() |
ratatui | Terminal UI framework (TUI) | Widgets, layouts, event handling, crossterm backend |
🧪 Testing & Quality
| Crate | Description | Key Feature |
|---|---|---|
tokio-test | Testing utilities for Tokio | Mock I/O, time control, assert_ready! |
mockall | Mocking framework | Automock derive, expectations, sequences |
proptest | Property-based testing | Shrinking, strategies, regression files |
criterion | Statistical benchmarking | Plots, regression detection, stable API |
insta | Snapshot testing | Review mode, inline/file snapshots, redactions |
wiremock | HTTP mocking for integration tests | Request matching, response templates, async |
🧮 Math, Science & Data
| Crate | Description | Key Feature |
|---|---|---|
ndarray | N-dimensional arrays (NumPy-like) | Slicing, broadcasting, BLAS integration |
nalgebra | Linear algebra library | Matrices, quaternions, geometric transforms |
polars | Fast DataFrame library (Pandas-like) | Lazy evaluation, multi-threaded, Arrow-based |
rand | Random number generation | Distributions, thread-local, reproducible |
rayon | Data parallelism (parallel iterators) | .par_iter(), work-stealing, zero-config |
🔧 Error Handling & Utilities
| Crate | Description | Key Feature |
|---|---|---|
anyhow | Flexible error handling for applications | anyhow::Result, context chaining, any error type |
thiserror | Derive macros for custom error types (libraries) | #[error("...")] derive, From conversion |
eyre | Customizable error reporting | Pluggable handlers, color-eyre for rich reports |
once_cell | Lazy initialization (now in std as LazyLock/OnceLock) | Lazy, OnceCell, thread-safe |
itertools | Extra iterator adaptors | .chunks(), .tuple_windows(), .join(), combinations |
bytes | Efficient byte buffer utilities | Bytes (shared), BytesMut (mutable), zero-copy slicing |
dashmap | Concurrent hash map | Lock-free reads, sharded writes, Entry API |
🏗️ Build & Macros
| Crate | Description | Key Feature |
|---|---|---|
proc-macro2 | Procedural macro token streams | Foundation for derive macros |
syn | Rust source code parser | Parse derive input, expressions, items |
quote | Quasi-quoting for code generation | quote! { ... } to generate TokenStream |
cc | C/C++ build tool integration | Compile C files in build.rs |
bindgen | Auto-generate FFI bindings from C headers | Reads .h files, generates unsafe Rust bindings |
🌍 WebAssembly
| Crate | Description | Key Feature |
|---|---|---|
wasm-bindgen | Rust ↔ JavaScript interop for Wasm | Call JS from Rust, export Rust to JS |
web-sys | Raw Web API bindings | DOM, fetch, WebSocket, Canvas, WebGL |
yew | Component-based web framework (like React) | Virtual DOM, hooks, SSR, macro-based HTML |
leptos | Fine-grained reactive web framework | Signals, SSR + hydration, full-stack |
dioxus | Cross-platform UI framework | Web, desktop, mobile from one codebase |
🗃️ Embedded & Systems
| Crate | Description | Key Feature |
|---|---|---|
embedded-hal | Hardware abstraction layer traits | Portable drivers, GPIO, SPI, I2C, UART |
defmt | Efficient logging for embedded | Deferred formatting, tiny binary size |
nix | Unix system call bindings | Safe wrappers for POSIX APIs |
libc | Raw C library bindings | Platform types and constants |
mio | Low-level non-blocking I/O | epoll/kqueue abstraction, foundation for Tokio |
59. Cryptography — ECDSA & secp256k1 Advanced
The Elliptic Curve Digital Signature Algorithm (ECDSA) is the signature scheme behind Bitcoin, Ethereum, and most modern cryptographic identity systems. The curve secp256k1 (used by Bitcoin/Ethereum) is a specific Koblitz curve chosen for efficiency. Rust has excellent crate support for working with these primitives safely and performantly.
59.1 Background: How ECDSA Works
An ECDSA signature over a message hash h is a pair (r, s), where r is derived from a fresh random nonce k and s = k⁻¹(h + r·d) mod n (d is the private key, n the curve order). The scheme's security depends on k being unpredictable and never reused: if two signatures share the same nonce k, the private key can be computed from the two signatures. This is how the PlayStation 3 ECDSA key was broken. Always use deterministic nonce generation (RFC 6979) or a cryptographically secure RNG.
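A toy demonstration of why nonce reuse is catastrophic, using the ECDSA signing equation s = k⁻¹(h + r·d) mod n over a small prime with made-up numbers (NOT a real curve, just the algebra):

```rust
// Toy nonce-reuse attack: given two signatures that share the same k (and
// hence the same r), an attacker recovers first k, then the private key d.
fn mod_pow(mut b: i128, mut e: i128, m: i128) -> i128 {
    let mut r = 1;
    b %= m;
    while e > 0 {
        if e & 1 == 1 { r = r * b % m; }
        b = b * b % m;
        e >>= 1;
    }
    r
}
// Modular inverse via Fermat's little theorem (m must be prime)
fn inv(a: i128, m: i128) -> i128 { mod_pow(a.rem_euclid(m), m - 2, m) }

fn main() {
    let n: i128 = 7919;        // toy prime "group order"
    let d: i128 = 1234;        // private key
    let k: i128 = 999;         // nonce — reused for BOTH signatures (the bug)
    let r: i128 = 2222;        // r is derived from k, so it repeats too
    let (h1, h2) = (42, 1000); // hashes of two different messages
    let s1 = inv(k, n) * (h1 + r * d) % n;
    let s2 = inv(k, n) * (h2 + r * d) % n;
    // Attacker sees (r, s1, h1) and (r, s2, h2):
    // s1 - s2 = k⁻¹(h1 - h2)  ⇒  k = (h1 - h2)·(s1 - s2)⁻¹ mod n
    let k_rec = (h1 - h2).rem_euclid(n) * inv(s1 - s2, n) % n;
    // From s1·k = h1 + r·d  ⇒  d = (s1·k - h1)·r⁻¹ mod n
    let d_rec = (s1 * k_rec % n - h1).rem_euclid(n) * inv(r, n) % n;
    assert_eq!(k_rec, k);
    assert_eq!(d_rec, d);
    println!("recovered private key: {d_rec}");
}
```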
59.2 Rust Crate Ecosystem for ECDSA
| Crate | Description | Use When |
|---|---|---|
k256 | Pure-Rust secp256k1 (RustCrypto group) | General secp256k1 ECDSA, no C deps, WASM-friendly |
secp256k1 | Bindings to Bitcoin Core's libsecp256k1 (C) | Bitcoin-specific, maximum performance, battle-tested |
ecdsa | Generic ECDSA traits (RustCrypto) | Curve-agnostic code, trait-based API |
p256 | NIST P-256 curve (RustCrypto) | TLS, WebAuthn, FIPS compliance |
elliptic-curve | Core elliptic curve traits | Foundation for k256, p256, etc. |
ethers / alloy | Ethereum development toolkit | Ethereum signing, transactions, EIP-712 |
ring | Crypto primitives (wraps BoringSSL) | ECDSA P-256/P-384, high assurance, no secp256k1 |
59.3 Using k256 — Pure-Rust secp256k1
# Cargo.toml
[dependencies]
k256 = { version = "0.13", features = ["ecdsa", "sha256"] }
rand_core = { version = "0.6", features = ["getrandom"] }
use k256::{
ecdsa::{SigningKey, VerifyingKey, Signature, signature::{Signer, Verifier}},
SecretKey, PublicKey,
elliptic_curve::rand_core::OsRng,
};
// ── Key Generation ──
let signing_key = SigningKey::random(&mut OsRng);
let verifying_key = VerifyingKey::from(&signing_key);
// ── Signing ── (deterministic nonce via RFC 6979)
let message = b"Hello, secp256k1!";
let signature: Signature = signing_key.sign(message);
// sign() hashes with SHA-256 internally (RFC 6979 deterministic nonce)
// ── Verification ──
assert!(verifying_key.verify(message, &signature).is_ok());
// Tampered message fails
assert!(verifying_key.verify(b"wrong", &signature).is_err());
println!("Signature verified successfully!");
59.4 Key Serialization & Formats
use k256::{
ecdsa::SigningKey,
elliptic_curve::sec1::ToEncodedPoint,
SecretKey, PublicKey,
};
let signing_key = SigningKey::random(&mut OsRng);
// ── Private Key Serialization ──
// Raw 32 bytes
let sk_bytes = signing_key.to_bytes();
assert_eq!(sk_bytes.len(), 32);
// Reconstruct from bytes
let recovered = SigningKey::from_bytes(&sk_bytes).unwrap();
// ── Public Key Serialization ──
let pk = PublicKey::from(signing_key.verifying_key());
// Compressed (33 bytes: 1 prefix byte + 32 bytes x-coordinate)
let compressed = pk.to_encoded_point(true);
assert_eq!(compressed.as_bytes().len(), 33);
println!("Compressed: {}", hex::encode(compressed.as_bytes()));
// Uncompressed (65 bytes: 0x04 + 32 bytes x + 32 bytes y)
let uncompressed = pk.to_encoded_point(false);
assert_eq!(uncompressed.as_bytes().len(), 65);
println!("Uncompressed: {}", hex::encode(uncompressed.as_bytes()));
// Recover public key from compressed bytes
let pk2 = PublicKey::from_sec1_bytes(compressed.as_bytes()).unwrap();
assert_eq!(pk, pk2);
// ── Signature Serialization ──
use k256::ecdsa::Signature;
let sig: Signature = signing_key.sign(b"data");
// DER encoding (variable length, ~70-72 bytes)
let der_bytes = sig.to_der();
println!("DER signature: {} bytes", der_bytes.as_bytes().len());
// Fixed-size encoding (64 bytes: r || s, 32 bytes each)
let fixed_bytes = sig.to_bytes();
assert_eq!(fixed_bytes.len(), 64);
// Reconstruct
let sig2 = Signature::from_der(der_bytes.as_bytes()).unwrap();
let sig3 = Signature::from_bytes(&fixed_bytes).unwrap();
59.5 Using secp256k1 — Bitcoin's C Library Bindings
# Cargo.toml
[dependencies]
secp256k1 = { version = "0.29", features = ["rand-std", "global-context"] }
use secp256k1::{Secp256k1, Message, SecretKey, PublicKey};
use secp256k1::rand::rngs::OsRng;
// Create context (pre-computes tables for fast operations)
let secp = Secp256k1::new();
// Generate keypair
let (secret_key, public_key) = secp.generate_keypair(&mut OsRng);
// Sign a pre-hashed message (32 bytes)
use secp256k1::hashes::{Hash, sha256};
let msg_hash = sha256::Hash::hash(b"Hello Bitcoin!");
let message = Message::from_digest(msg_hash.to_byte_array());
let signature = secp.sign_ecdsa(&message, &secret_key);
// Verify
assert!(secp.verify_ecdsa(&message, &signature, &public_key).is_ok());
// Schnorr signatures (BIP-340, Bitcoin Taproot)
use secp256k1::Keypair;
let keypair = Keypair::new(&secp, &mut OsRng);
let schnorr_sig = secp.sign_schnorr_no_aux_rand(&message, &keypair);
let xonly_pk = keypair.x_only_public_key().0;
assert!(secp.verify_schnorr(&schnorr_sig, &message, &xonly_pk).is_ok());
k256 vs secp256k1: k256 is pure Rust (no C deps, WASM-compatible, cleaner API). secp256k1 wraps Bitcoin Core's battle-hardened C library (faster for bulk operations, Schnorr/Taproot support). For Ethereum work, k256 is preferred. For Bitcoin, secp256k1 is standard.
59.6 Ethereum-Style Signatures (Recoverable)
use k256::ecdsa::{SigningKey, VerifyingKey, RecoveryId};
use k256::ecdsa::signature::DigestSigner;
use sha3::{Keccak256, Digest};
// Ethereum uses Keccak-256 (NOT SHA-256) for message hashing
// EIP-191: Ethereum personal_sign format
fn eth_message_hash(message: &[u8]) -> [u8; 32] {
let prefix = format!("\x19Ethereum Signed Message:\n{}", message.len());
let mut hasher = Keccak256::new();
hasher.update(prefix.as_bytes());
hasher.update(message);
hasher.finalize().into()
}
let signing_key = SigningKey::random(&mut OsRng);
let message = b"Hello Ethereum!";
// Hash with Keccak-256
let digest = Keccak256::new_with_prefix(message);
// Sign producing a recoverable signature
let (signature, recovery_id) = signing_key.sign_digest_recoverable(digest).unwrap();
// Recovery ID (v): 0 or 1 — tells which of two possible public keys is correct
// Ethereum encodes this as v = recovery_id + 27 (legacy) or v = recovery_id (EIP-155)
// Recover the public key from just the signature + message
let digest2 = Keccak256::new_with_prefix(message);
let recovered_key = VerifyingKey::recover_from_digest(
digest2,
&signature,
recovery_id,
).unwrap();
assert_eq!(
VerifyingKey::from(&signing_key),
recovered_key
);
// Derive Ethereum address from public key
fn pubkey_to_eth_address(vk: &VerifyingKey) -> [u8; 20] {
use k256::elliptic_curve::sec1::ToEncodedPoint;
let point = vk.to_encoded_point(false); // uncompressed, 65 bytes
let hash = Keccak256::digest(&point.as_bytes()[1..]); // skip 0x04 prefix
let mut addr = [0u8; 20];
addr.copy_from_slice(&hash[12..]); // last 20 bytes of keccak hash
addr
}
let address = pubkey_to_eth_address(&VerifyingKey::from(&signing_key));
println!("ETH Address: 0x{}", hex::encode(address));
59.7 ECDH — Key Exchange
use k256::{SecretKey, PublicKey, ecdh::EphemeralSecret};
use k256::elliptic_curve::rand_core::OsRng;
// ECDH: Two parties derive a shared secret without transmitting it
// Alice generates ephemeral keypair
let alice_secret = EphemeralSecret::random(&mut OsRng);
let alice_public = alice_secret.public_key();
// Bob generates ephemeral keypair
let bob_secret = EphemeralSecret::random(&mut OsRng);
let bob_public = bob_secret.public_key();
// Both compute the same shared secret
let alice_shared = alice_secret.diffie_hellman(&bob_public);
let bob_shared = bob_secret.diffie_hellman(&alice_public);
// alice_shared == bob_shared (same point on curve)
assert_eq!(
alice_shared.raw_secret_bytes(),
bob_shared.raw_secret_bytes()
);
// Derive a symmetric key from the shared secret. Production code should
// use a proper KDF (e.g. HKDF); a single SHA-256 keeps this demo short.
use sha2::{Sha256, Digest};
let symmetric_key = Sha256::digest(alice_shared.raw_secret_bytes());
println!("Shared AES key: {}", hex::encode(&symmetric_key));
59.8 HD Key Derivation (BIP-32)
# Cargo.toml
[dependencies]
bip32 = "0.5"
// Hierarchical Deterministic wallets — derive child keys from a master
use bip32::{Mnemonic, XPrv, DerivationPath, Language};
use rand_core::OsRng;
// Generate a 24-word mnemonic (32 bytes of entropy)
let mnemonic = Mnemonic::random(&mut OsRng, Language::English);
println!("Seed phrase: {}", mnemonic.phrase());
// Derive master key from mnemonic
let seed = mnemonic.to_seed("optional_passphrase");
let master_key = XPrv::new(&seed).unwrap();
// BIP-44 derivation path: m/44'/60'/0'/0/0 (Ethereum)
let path: DerivationPath = "m/44'/60'/0'/0/0".parse().unwrap();
let child_key = XPrv::derive_from_path(&seed, &path).unwrap();
// Get the signing key for transactions (XPrv wraps k256::ecdsa::SigningKey)
let signing_key = child_key.private_key();
59.9 Security Best Practices
| Practice | Why | How in Rust |
|---|---|---|
| Never reuse nonces | Reveals private key from two signatures | Use RFC 6979 (default in k256) or CSPRNG |
| Constant-time operations | Prevents timing side-channel attacks | k256/secp256k1 crates are constant-time by design |
| Zeroize secrets on drop | Prevent secrets lingering in memory | use zeroize::Zeroize; — SigningKey auto-zeroizes |
| Validate public keys | Invalid-curve attacks | PublicKey::from_sec1_bytes() validates point on curve |
| Use canonical signatures | Signature malleability (low-S) | k256 normalizes to low-S by default |
| Hash before signing | ECDSA signs fixed-length digests | Use .sign() which hashes internally, or sign_digest() |
// Zeroize: secrets are automatically cleared from memory
use zeroize::Zeroize;
{
let sk = SigningKey::random(&mut OsRng);
// ... use sk ...
} // sk is dropped here → memory is zeroed automatically
// (k256::SigningKey implements ZeroizeOnDrop)
// Manual zeroize for raw bytes
let mut secret_bytes = [0u8; 32];
// ... fill with secret data ...
secret_bytes.zeroize(); // overwrite with zeros
// Signature malleability — k256 normalizes automatically
// For Bitcoin consensus: both (r, s) and (r, n-s) are valid ECDSA sigs
// k256 always produces "low-S" form to prevent malleability
let sig: Signature = signing_key.sign(message);
let normalized = sig.normalize_s().unwrap_or(sig); // already normalized
59.10 Full Example: Sign & Verify a Message
use k256::ecdsa::{SigningKey, VerifyingKey, Signature};
use k256::ecdsa::signature::{Signer, Verifier};
use k256::elliptic_curve::rand_core::OsRng;
fn main() {
// 1. Generate keys
let sk = SigningKey::random(&mut OsRng);
let vk = VerifyingKey::from(&sk);
// 2. Sign
let message = b"Transfer 1.5 ETH to 0xABCD...";
let signature: Signature = sk.sign(message);
// 3. Serialize for transmission
let sig_hex = hex::encode(signature.to_bytes());
let pk_hex = hex::encode(vk.to_sec1_bytes());
println!("Signature: {sig_hex}");
println!("Public Key: {pk_hex}");
// 4. Deserialize and verify (on receiving end)
let sig_bytes = hex::decode(&sig_hex).unwrap();
let pk_bytes = hex::decode(&pk_hex).unwrap();
let sig2 = Signature::from_slice(&sig_bytes).unwrap();
let vk2 = VerifyingKey::from_sec1_bytes(&pk_bytes).unwrap();
match vk2.verify(message, &sig2) {
Ok(_) => println!("Signature VALID"),
Err(e) => println!("Signature INVALID: {e}"),
}
}
For Ethereum → k256 (pure Rust, Keccak-256, recoverable sigs). For Bitcoin → secp256k1 (C bindings, Schnorr/Taproot). For TLS/WebAuthn → p256 or ring (NIST curves). For WASM → k256 (no C deps). For general ECDSA → ecdsa + curve crate.
60. Fuzzing Intermediate
Fuzz testing feeds random or semi-random inputs to your code to discover bugs, panics, and undefined behavior that unit tests miss. Rust's type safety doesn't prevent logic errors, and fuzzing excels at finding panics in parsers, deserializers, and anything that handles untrusted input.
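The class of bug fuzzing finds is easy to illustrate. Here is a hypothetical `parse_header` with a length-check bug: the author checked for empty input but not for inputs of 1 to 3 bytes, so a short input panics via an out-of-bounds slice. This is exactly the kind of input a mutating fuzzer stumbles on in seconds:

```rust
// A hypothetical parser with a fuzzer-findable bug.
fn parse_header(data: &[u8]) -> Option<u32> {
    if data.is_empty() {
        return None;
    }
    // BUG: we checked for empty, but not len >= 4 — data[..4] panics on len 1..=3
    let len = u32::from_be_bytes(data[..4].try_into().ok()?);
    Some(len)
}

fn main() {
    // Well-formed inputs behave fine — unit tests pass
    assert_eq!(parse_header(&[0, 0, 0, 7, 9]), Some(7));
    assert_eq!(parse_header(&[]), None);
    // A fuzzer quickly produces a 2-byte input and reports the panic
    let result = std::panic::catch_unwind(|| parse_header(&[1, 2]));
    assert!(result.is_err()); // panicked — the bug a fuzzer would report
}
```

The fix is to replace the indexing with `data.get(..4)?`, which returns `None` instead of panicking.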
60.1 Fuzzing Tools for Rust
| Tool | Backend | Best For |
|---|---|---|
cargo-fuzz | libFuzzer (LLVM) | Quick setup, coverage-guided, most popular |
afl.rs | AFL++ (American Fuzzy Lop) | Fork-based, parallel fuzzing, great mutation |
bolero | Multiple (libFuzzer, AFL, Kani) | Unified API, property testing + fuzzing |
arbitrary | (helper crate) | Structure-aware fuzzing — derive random structs |
proptest | Property-based testing | Shrinking, deterministic, complementary to fuzzing |
60.2 cargo-fuzz — Getting Started
# Install
cargo install cargo-fuzz
# Initialize fuzzing in your project
cd my-crate
cargo fuzz init
# Creates: fuzz/Cargo.toml and fuzz/fuzz_targets/
# Add a fuzz target
cargo fuzz add my_parser
// fuzz/fuzz_targets/my_parser.rs
#![no_main]
use libfuzzer_sys::fuzz_target;
use my_crate::parse_input;
fuzz_target!(|data: &[u8]| {
// This is called thousands of times per second
// with mutated inputs. Any panic = bug found!
let _ = parse_input(data);
});
// Run the fuzzer (requires nightly)
// cargo +nightly fuzz run my_parser
// Ctrl+C to stop. Crashes saved to fuzz/artifacts/my_parser/
60.3 Structure-Aware Fuzzing with Arbitrary
# Cargo.toml (fuzz target)
[dependencies]
arbitrary = { version = "1", features = ["derive"] }
libfuzzer-sys = "0.4"
use arbitrary::Arbitrary;
use libfuzzer_sys::fuzz_target;
// Instead of raw bytes, generate structured inputs
#[derive(Arbitrary, Debug)]
struct Config {
width: u32,
height: u32,
name: String,
mode: Mode,
}
#[derive(Arbitrary, Debug)]
enum Mode { Fast, Safe, Compat }
fuzz_target!(|config: Config| {
// The fuzzer generates valid Config structs
// and mutates their fields intelligently
let _ = my_crate::process(config);
});
// Also works with enums for protocol fuzzing
#[derive(Arbitrary, Debug)]
enum Command {
Get { key: String },
Set { key: String, value: Vec<u8> },
Delete { key: String },
Flush,
}
fuzz_target!(|commands: Vec<Command>| {
let mut db = my_crate::Database::new();
for cmd in commands {
let _ = db.execute(cmd); // test sequences of operations
}
});
60.4 Differential Fuzzing
// Compare two implementations — they should produce identical output
fuzz_target!(|data: &[u8]| {
let result_v1 = parser_v1::parse(data);
let result_v2 = parser_v2::parse(data);
match (result_v1, result_v2) {
(Ok(a), Ok(b)) => assert_eq!(a, b, "outputs differ"),
(Err(_), Err(_)) => {} // both reject — fine
_ => panic!("one accepted, other rejected"),
}
});
60.5 Reproducing & Managing Crashes
# Reproduce a crash
cargo +nightly fuzz run my_parser fuzz/artifacts/my_parser/crash-abc123
# Minimize a crash input (find smallest reproducer)
cargo +nightly fuzz tmin my_parser fuzz/artifacts/my_parser/crash-abc123
# Build a corpus of interesting inputs
mkdir fuzz/corpus/my_parser
# Add seed inputs (valid examples of your format)
cp test_data/*.bin fuzz/corpus/my_parser/
# Run with corpus (fuzzer learns from these)
cargo +nightly fuzz run my_parser fuzz/corpus/my_parser/
# Run for a fixed duration
cargo +nightly fuzz run my_parser -- -max_total_time=300 # 5 minutes
# Check coverage
cargo +nightly fuzz coverage my_parser
# View with: llvm-cov show ...
Fuzz any function that takes &[u8] or &str from untrusted sources. If it processes external input, fuzz it.
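Even without cargo-fuzz installed, you can smoke-test a parser with a quick random-input loop. This is a toy sketch using a simple LCG for input generation (not coverage-guided, so no substitute for a real fuzzer); `robust_parse` is a hypothetical length-checked parser:

```rust
// Toy random-input smoke test: assert the parser never panics on any input.
fn lcg(state: &mut u64) -> u64 {
    // Constants from Knuth's MMIX LCG
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state
}

fn robust_parse(data: &[u8]) -> Option<u32> {
    // Length-checked slicing: get(..4) returns None instead of panicking
    let bytes: [u8; 4] = data.get(..4)?.try_into().ok()?;
    Some(u32::from_be_bytes(bytes))
}

fn main() {
    let mut seed = 0xDEADBEEFu64;
    for _ in 0..10_000 {
        let len = (lcg(&mut seed) % 8) as usize;
        let input: Vec<u8> = (0..len).map(|_| lcg(&mut seed) as u8).collect();
        let _ = robust_parse(&input); // must never panic, whatever the input
    }
    assert_eq!(robust_parse(&[0, 0, 1, 0]), Some(256));
    assert_eq!(robust_parse(&[1]), None);
}
```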
61. Async Closures & AsyncFn Traits Intermediate
Rust 1.85 stabilized async closures — closures that can .await inside their body. Before this, the common workaround was returning a future from a regular closure, which was verbose and had lifetime issues. Async closures solve these ergonomic problems.
61.1 The Problem Before Async Closures
// BEFORE: awkward workaround with regular closures returning futures
fn retry<F, Fut, T>(f: F, attempts: u32) -> T
where
F: Fn() -> Fut, // closure returns a future
Fut: Future<Output = T>,
{
// ...
}
// Usage was verbose
retry(|| async { fetch_data().await }, 3);
// And had lifetime issues when borrowing:
let url = String::from("https://example.com");
// ❌ ERROR: closure returns a future that borrows `url`,
// but the Fn() -> Fut bound can't express that lifetime
retry(|| async { fetch(&url).await }, 3);
61.2 Async Closures (Rust 1.85+)
// AFTER: native async closures — clean and correct
let greet = async |name: &str| {
let response = fetch_greeting(name).await;
println!("Hello, {name}: {response}");
};
greet("Rust").await;
// Borrowing works naturally!
let db = Database::connect().await;
let query = async |id: u32| {
db.fetch(id).await // borrows `db` — works correctly
};
let user1 = query(1).await;
let user2 = query(2).await; // can call multiple times
// With move semantics
let client = reqwest::Client::new();
let fetch = async move |url: &str| {
client.get(url).send().await // owns `client`
};
61.3 The AsyncFn Trait Family
// Three new traits mirror the sync closure traits:
// AsyncFn — can be called by shared reference (&self)
// AsyncFnMut — can be called by mutable reference (&mut self)
// AsyncFnOnce — can be called once, consuming self
// Hierarchy (like Fn/FnMut/FnOnce):
// AsyncFn: AsyncFnMut: AsyncFnOnce
// Use in function bounds
async fn retry_with<F, T>(f: F, attempts: u32) -> Option<T>
where
F: AsyncFn() -> Result<T, Error>, // async closure bound
{
for _ in 0..attempts {
if let Ok(val) = f().await {
return Some(val);
}
}
None
}
// impl AsyncFn syntax (like impl Fn)
async fn apply(callback: impl AsyncFn(i32) -> String) {
let result = callback(42).await;
println!("{result}");
}
// Usage
apply(async |x| {
let data = fetch(x).await;
format!("got: {data}")
}).await;
61.4 Async Closures vs Closure-Returning-Future
| Feature | `async \|\| { }` | `\|\| async { }` |
|---|---|---|
| Syntax | `async \|x\| { x.await }` | `\|x\| async move { x.await }` |
| Borrowing | Borrows captured variables correctly | Lifetime issues with borrows |
| Called multiple times | Natural — implements AsyncFn | Each call creates a new future |
| Bound syntax | `F: AsyncFn(T) -> U` | `F: Fn(T) -> Fut, Fut: Future<Output=U>` |
| Available since | Rust 1.85 | Always (workaround) |
| Captures | Same as sync closures | Often requires `move` + `Clone` |
If your API takes F: Fn() -> Fut, Fut: Future<Output = T> bounds, you can simplify to F: AsyncFn() -> T. The old pattern still works, but async closures are more ergonomic and handle borrowing correctly.
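A self-contained, runnable sketch (assuming Rust 1.85+). `block_on` is a minimal busy-poll executor built from std only, and `call_twice`/`double` are made-up names for illustration:

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Minimal executor: busy-polls a future to completion. Fine for
// futures that never actually suspend, as in this demo.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

// AsyncFn bound: the closure can be called multiple times
async fn call_twice<F: AsyncFn(i32) -> i32>(f: F) -> (i32, i32) {
    (f(1).await, f(2).await)
}

async fn double(x: i32) -> i32 { x * 2 }

fn main() {
    // Native async closure (Rust 1.85+) satisfies the AsyncFn bound
    let f = async |x: i32| double(x).await + 1;
    let out = block_on(call_twice(f));
    assert_eq!(out, (3, 5)); // 1*2+1, 2*2+1
}
```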
62. Rustdoc & Documentation Core
Rust has best-in-class documentation tooling built into the language. Doc comments compile into HTML, doc-tests run as part of cargo test, and the ecosystem standard is to document everything public. Good documentation is considered part of the API contract.
62.1 Doc Comment Syntax
/// Line doc comment — documents the NEXT item
/// Supports full **Markdown**: *italic*, `code`, [links](url)
pub fn add(a: i32, b: i32) -> i32 {
a + b
}
/** Block doc comment — also documents the next item.
Rarely used; prefer /// style. */
pub struct Point {
/// The x coordinate
pub x: f64,
/// The y coordinate
pub y: f64,
}
//! Inner doc comment — documents the ENCLOSING item
//! Used at the top of lib.rs or mod.rs for module/crate docs
//!
//! # My Crate
//!
//! This crate provides utilities for ...
62.2 Documentation Sections & Conventions
/// Computes the distance between two points.
///
/// # Arguments
///
/// * `a` - The first point
/// * `b` - The second point
///
/// # Returns
///
/// The Euclidean distance as `f64`.
///
/// # Examples
///
/// ```
/// let a = Point { x: 0.0, y: 0.0 };
/// let b = Point { x: 3.0, y: 4.0 };
/// assert_eq!(distance(&a, &b), 5.0);
/// ```
///
/// # Panics
///
/// Does not panic.
///
/// # Errors
///
/// Returns `Err` if coordinates are NaN.
///
/// # Safety
///
/// (For unsafe functions — describe invariants the caller must uphold)
pub fn distance(a: &Point, b: &Point) -> f64 {
((b.x - a.x).powi(2) + (b.y - a.y).powi(2)).sqrt()
}
Conventional rustdoc section headings are # Examples, # Panics, # Errors, # Safety. The # Examples section is the most important — Rust culture strongly encourages examples for every public function.
62.3 Doc-Tests — Examples That Run
/// Parses a string into a Config.
///
/// ```
/// // This code block is compiled and run by `cargo test`!
/// use my_crate::Config;
///
/// let config = Config::parse("key=value").unwrap();
/// assert_eq!(config.get("key"), Some("value"));
/// ```
///
/// Errors are shown with `should_panic`:
///
/// ```should_panic
/// my_crate::Config::parse("").unwrap(); // panics!
/// ```
///
/// Code that should compile but not run:
///
/// ```no_run
/// // Compiles but doesn't execute (e.g., network code)
/// let server = my_crate::Server::bind("0.0.0.0:8080");
/// ```
///
/// Code that shouldn't even compile (negative example):
///
/// ```compile_fail
/// let x: u32 = "not a number"; // type error!
/// ```
///
/// Hide boilerplate with `#` (hidden lines):
///
/// ```
/// # use my_crate::Config;
/// # fn main() -> Result<(), Box<dyn std::error::Error>> {
/// let config = Config::parse("key=value")?;
/// assert_eq!(config.get("key"), Some("value"));
/// # Ok(())
/// # }
/// ```
pub fn parse(input: &str) -> Result<Config, ParseError> {
// ...
}
# Run doc-tests alongside unit tests
cargo test
# Run ONLY doc-tests
cargo test --doc
# Run doc-tests for a specific function
cargo test --doc parse
62.4 Intra-Doc Links
/// Creates a new [`Widget`] with the given [`Config`].
///
/// See [`Widget::render`] for how to display it.
/// Uses [`crate::utils::format_output`] internally.
///
/// For the error type, see [`ConfigError`](crate::errors::ConfigError).
///
/// Link to a method: [`Vec::push`]
/// Link to a trait: [`Iterator`]
/// Link to a module: [`std::collections`]
/// Link to an enum variant: [`Option::Some`]
pub fn new(config: Config) -> Widget { /* ... */ }
62.5 Documentation Attributes
// Hide an item from public docs
#[doc(hidden)]
pub fn __internal_helper() {}
// Add an alias for search
#[doc(alias = "serialize")]
pub fn encode() {}
// Custom HTML in crate root (lib.rs)
#![doc = include_str!("../README.md")] // use README as crate docs
// Feature-gated docs
#[cfg_attr(docsrs, doc(cfg(feature = "serde")))]
pub fn serialize() {} // shows "Available on feature serde only" badge
// Deny missing docs (enforce documentation)
#![deny(missing_docs)] // compile error if public items lack docs
#![warn(missing_docs)] // warning instead of error
62.6 Generating & Publishing Docs
# Generate HTML docs
cargo doc
# Generate and open in browser
cargo doc --open
# Include private items
cargo doc --document-private-items
# Generate docs for all dependencies too
cargo doc --no-deps # only your crate (faster, usually what you want)
# docs.rs — auto-published when you `cargo publish`
# Configure in Cargo.toml:
[package.metadata.docs.rs]
all-features = true
rustdoc-args = ["--cfg", "docsrs"]
# Best practices: include # Examples for every public function; use #![deny(missing_docs)] in libraries; let doc-tests serve double duty as documentation AND tests; and use intra-doc links — they're checked by the compiler and never go stale.
63. Const Generics Deep Dive Intermediate
Const generics let you parameterize types and functions by compile-time constant values (not just types). This enables type-safe fixed-size arrays, matrices, protocol buffers, and more — catching size mismatches at compile time.
63.1 Basics
// Generic over array size
struct ArrayWrapper<const N: usize> {
data: [f64; N],
}
impl<const N: usize> ArrayWrapper<N> {
fn new() -> Self {
ArrayWrapper { data: [0.0; N] }
}
fn sum(&self) -> f64 {
self.data.iter().sum()
}
}
let small = ArrayWrapper::<3>::new(); // [f64; 3]
let big = ArrayWrapper::<1000>::new(); // [f64; 1000]
// small and big are DIFFERENT types — can't mix them
// Type-safe matrix multiplication
struct Matrix<const ROWS: usize, const COLS: usize> {
data: [[f64; COLS]; ROWS],
}
// Dimensions checked at compile time!
fn multiply<const M: usize, const N: usize, const P: usize>(
a: &Matrix<M, N>,
b: &Matrix<N, P>, // N must match!
) -> Matrix<M, P> {
// ... matrix multiplication ...
todo!()
}
let a: Matrix<2, 3> = /* ... */;
let b: Matrix<3, 4> = /* ... */;
let c = multiply(&a, &b); // Matrix<2, 4> ✅
// let d = multiply(&a, &a); // compile error: Matrix<2,3> × Matrix<2,3> — 3 ≠ 2
63.2 Allowed Const Parameter Types
| Allowed | Not (Yet) Allowed |
|---|---|
usize, u8...u128, i8...i128 | Floats (f32, f64) |
bool | String, &str |
char | Custom structs/enums |
| References, pointers |
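The non-integer cases from the table are easy to exercise. A minimal sketch of bool and char const parameters (both stable), with made-up names `Flag` and `fill`:

```rust
// bool const parameter: specialize behavior per compile-time flag
struct Flag<const ENABLED: bool>;

impl Flag<true> {
    fn status() -> &'static str { "on" }
}
impl Flag<false> {
    fn status() -> &'static str { "off" }
}

// char const parameter: the fill character is part of the type
fn fill<const C: char, const N: usize>() -> String {
    std::iter::repeat(C).take(N).collect()
}

fn main() {
    assert_eq!(Flag::<true>::status(), "on");
    assert_eq!(Flag::<false>::status(), "off");
    assert_eq!(fill::<'x', 3>(), "xxx");
}
```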
63.3 Default Values & Where Clauses
// Default const generic value
struct Buffer<const SIZE: usize = 1024> {
data: [u8; SIZE],
}
let default_buf: Buffer = Buffer { data: [0; 1024] }; // SIZE defaults to 1024
let small_buf = Buffer::<64> { data: [0; 64] };
// Const expressions in where clauses (nightly: generic_const_exprs)
// #![feature(generic_const_exprs)]
// fn concat<const A: usize, const B: usize>(
// a: [u8; A], b: [u8; B]
// ) -> [u8; A + B] // ← const expression! (nightly only)
// where [(); A + B]: Sized
// { ... }
// Stable workaround: use associated consts
trait ArraySize {
const SIZE: usize;
}
struct Small;
impl ArraySize for Small { const SIZE: usize = 64; }
struct Large;
impl ArraySize for Large { const SIZE: usize = 4096; }
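The associated-const workaround can then drive sizing generically. A minimal runnable sketch (it redeclares the trait and markers from above so it is self-contained, and uses a Vec because `[u8; S::SIZE]` would require the nightly generic_const_exprs feature):

```rust
// Type-level sizes via associated consts — works on stable Rust.
trait ArraySize {
    const SIZE: usize;
}

struct Small;
impl ArraySize for Small { const SIZE: usize = 64; }

struct Large;
impl ArraySize for Large { const SIZE: usize = 4096; }

// Heap buffer sized by the type-level constant
fn buffer_for<S: ArraySize>() -> Vec<u8> {
    vec![0u8; S::SIZE]
}

fn main() {
    assert_eq!(buffer_for::<Small>().len(), 64);
    assert_eq!(buffer_for::<Large>().len(), 4096);
}
```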
63.4 Practical Patterns
// Fixed-size ring buffer
struct RingBuffer<T, const N: usize> {
buf: [Option<T>; N],
head: usize,
len: usize,
}
// Type-safe protocol with version number
struct Message<const VERSION: u8> {
payload: Vec<u8>,
}
// Message<1> and Message<2> are incompatible types
// Compile-time assertions
fn must_be_power_of_two<const N: usize>() {
const { assert!(N.is_power_of_two()) }; // compile-time check!
}
must_be_power_of_two::<16>(); // ✅
// must_be_power_of_two::<15>(); // ❌ compile error
// std library examples (array_windows / array_chunks are nightly-only)
let arr: [i32; 5] = [1, 2, 3, 4, 5];
let windows: Vec<&[i32; 3]> = arr.array_windows::<3>().collect();
let chunks: Vec<[i32; 2]> = arr.array_chunks::<2>().copied().collect();
// SIMD-friendly aligned buffer
#[repr(align(32))]
struct AlignedBuffer<const N: usize>([u8; N]);
64. Cargo Features & Conditional Compilation Intermediate
Cargo features are compile-time flags that enable optional functionality, reduce dependency trees, and let users customize what gets compiled. Conditional compilation via #[cfg(...)] lets you write platform-specific or feature-gated code.
64.1 Defining Features
# Cargo.toml
[features]
# Default features (enabled unless user opts out)
default = ["json", "logging"]
# Individual features
json = ["dep:serde_json"] # enables optional dependency
logging = ["dep:tracing"]
yaml = ["dep:serde_yaml"]
full = ["json", "yaml", "logging"] # meta-feature
# Feature that enables a feature in a dependency
tls = ["reqwest/rustls-tls"]
[dependencies]
serde = "1"
serde_json = { version = "1", optional = true } # only if "json" feature
serde_yaml = { version = "0.9", optional = true }
tracing = { version = "0.1", optional = true }
reqwest = { version = "0.12", optional = true }
# User selects features when adding dependency
cargo add my-crate # default features
cargo add my-crate --no-default-features # bare minimum
cargo add my-crate --features json,yaml # specific features
cargo add my-crate --all-features # everything
64.2 Using Features in Code
// #[cfg(feature = "...")] — compile this item only when feature is enabled
#[cfg(feature = "json")]
pub fn to_json<T: serde::Serialize>(val: &T) -> String {
serde_json::to_string(val).unwrap()
}
#[cfg(feature = "yaml")]
pub fn to_yaml<T: serde::Serialize>(val: &T) -> String {
serde_yaml::to_string(val).unwrap()
}
// cfg! macro — evaluates to true/false at compile time
fn init() {
if cfg!(feature = "logging") {
println!("Logging enabled");
}
}
// Conditional imports
#[cfg(feature = "json")]
use serde_json;
// Conditional module inclusion
#[cfg(feature = "json")]
mod json_support;
// Feature-gated trait implementations
#[cfg(feature = "json")]
impl FromJson for Config {
fn from_json(s: &str) -> Self { /* ... */ }
}
64.3 Platform & Target Conditional Compilation
// Operating system
#[cfg(target_os = "linux")]
fn platform_init() { /* Linux-specific */ }
#[cfg(target_os = "macos")]
fn platform_init() { /* macOS-specific */ }
#[cfg(target_os = "windows")]
fn platform_init() { /* Windows-specific */ }
// Architecture
#[cfg(target_arch = "x86_64")]
fn simd_add() { /* use SSE/AVX */ }
#[cfg(target_arch = "aarch64")]
fn simd_add() { /* use NEON */ }
// Combinations with any/all/not
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn linux_x86() {}
#[cfg(any(target_os = "linux", target_os = "macos"))]
fn unix_like() {}
#[cfg(not(target_os = "windows"))]
fn not_windows() {}
// Test/debug conditional
#[cfg(test)]
mod tests { /* only compiled for `cargo test` */ }
#[cfg(debug_assertions)]
fn expensive_check() { /* only in debug builds */ }
64.4 Feature Design Best Practices
Use the dep: prefix for optional dependency features (avoids implicit features), keep the default feature set minimal, and test with --all-features and --no-default-features in CI.
# CI: test all feature combinations
cargo test --no-default-features
cargo test --all-features
cargo test --features json
cargo test --features yaml
# Or use cargo-hack for exhaustive testing
cargo install cargo-hack
cargo hack test --feature-powerset
65. Procedural Macros Deep Dive Advanced
Procedural macros are Rust plugins that run at compile time, transforming token streams into new code. They enable derive macros, attribute macros, and function-like macros that can generate arbitrary Rust code from custom syntax.
65.1 Architecture
65.2 Project Setup
# Proc macros MUST live in their own crate
# my-macros/Cargo.toml
[lib]
proc-macro = true
[dependencies]
syn = { version = "2", features = ["full"] } # parse Rust syntax
quote = "1" # generate Rust code
proc-macro2 = "1" # token stream utilities
65.3 Derive Macro — Step by Step
// my-macros/src/lib.rs
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, DeriveInput};
/// Derive macro that generates a `describe()` method
#[proc_macro_derive(Describe)]
pub fn derive_describe(input: TokenStream) -> TokenStream {
// 1. Parse the input tokens into a syntax tree
let input = parse_macro_input!(input as DeriveInput);
let name = &input.ident; // struct/enum name
// 2. Extract field names
let fields = match &input.data {
syn::Data::Struct(data) => {
data.fields.iter().map(|f| {
let fname = f.ident.as_ref().unwrap();
let ftype = &f.ty;
quote! { format!(" {}: {}", stringify!(#fname), stringify!(#ftype)) }
}).collect::<Vec<_>>()
}
_ => panic!("Describe only supports structs"),
};
// 3. Generate the implementation
let expanded = quote! {
impl #name {
pub fn describe() -> String {
let fields = vec![#(#fields),*];
format!("struct {} {{\n{}\n}}",
stringify!(#name),
fields.join("\n"))
}
}
};
// 4. Return generated code as TokenStream
TokenStream::from(expanded)
}
// Usage in another crate:
// use my_macros::Describe;
//
// #[derive(Describe)]
// struct User {
// name: String,
// age: u32,
// }
//
// println!("{}", User::describe());
// Output:
// struct User {
// name: String
// age: u32
// }
65.4 Attribute Macro
// Attribute macros transform the item they're attached to
#[proc_macro_attribute]
pub fn log_calls(attr: TokenStream, item: TokenStream) -> TokenStream {
let input = parse_macro_input!(item as syn::ItemFn);
let fn_name = &input.sig.ident;
let fn_body = &input.block;
let fn_sig = &input.sig;
let vis = &input.vis;
let expanded = quote! {
#vis #fn_sig {
println!("[LOG] Entering {}", stringify!(#fn_name));
let __result = { #fn_body };
println!("[LOG] Exiting {}", stringify!(#fn_name));
__result
}
};
TokenStream::from(expanded)
}
// Usage:
// #[log_calls]
// fn process_data(x: i32) -> i32 { x * 2 }
65.5 Function-Like Macro
// Function-like proc macros are invoked like macro_rules! macros
#[proc_macro]
pub fn make_getters(input: TokenStream) -> TokenStream {
// Parse custom syntax: make_getters!(User { name: String, age: u32 })
// Generate getter methods for each field
// ...
input // placeholder
}
// Usage: make_getters!(User { name: String, age: u32 })
65.6 Debugging Proc Macros
# See what your macro generates
cargo expand # expand all macros (requires cargo-expand)
cargo expand --lib # expand lib.rs only
cargo expand my_module::MyStruct # expand specific item
// In the macro itself, use compile_error! for diagnostics:
if fields.is_empty() {
return quote! { compile_error!("Describe requires at least one field"); }.into();
}
// Print during compilation (shows in cargo build output):
eprintln!("Processing struct: {}", name);
- syn — parses Rust source into an AST.
- quote — generates Rust code via quasi-quoting (quote! { ... }).
- proc-macro2 — a proc_macro-compatible token type that also works outside the compiler (e.g., in unit tests).
- darling — simplifies parsing derive macro attributes.
- cargo-expand — view expanded macro output.
66. Benchmarking Intermediate
Benchmarking measures the actual performance of your code. Rust's ecosystem provides excellent tools for statistically rigorous benchmarking, catching regressions, and optimizing hot paths.
66.1 Criterion.rs — The Standard
# Cargo.toml
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
[[bench]]
name = "my_benchmark"
harness = false # use criterion's harness, not built-in
// benches/my_benchmark.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use my_crate::fibonacci;
fn bench_fibonacci(c: &mut Criterion) {
c.bench_function("fib 20", |b| {
b.iter(|| fibonacci(black_box(20)))
// black_box prevents the optimizer from removing the computation
});
}
fn bench_comparison(c: &mut Criterion) {
let mut group = c.benchmark_group("Fibonacci");
for size in [10, 20, 30] {
group.bench_with_input(
criterion::BenchmarkId::new("recursive", size),
&size,
|b, &n| b.iter(|| fib_recursive(black_box(n))),
);
group.bench_with_input(
criterion::BenchmarkId::new("iterative", size),
&size,
|b, &n| b.iter(|| fib_iterative(black_box(n))),
);
}
group.finish();
}
criterion_group!(benches, bench_fibonacci, bench_comparison);
criterion_main!(benches);
# Run benchmarks
cargo bench
# Run specific benchmark
cargo bench -- "fib 20"
# Open HTML report with plots
open target/criterion/report/index.html
# Output shows:
# fib 20 time: [845.12 ns 847.56 ns 850.23 ns]
# change: [-1.2% -0.3% +0.5%] (p = 0.42 > 0.05)
# No change in performance detected.
66.2 Benchmarking Async Code
# Cargo.toml
[dev-dependencies]
criterion = { version = "0.5", features = ["async_tokio"] }
tokio = { version = "1", features = ["full"] }
use criterion::{criterion_group, criterion_main, Criterion};
fn bench_async(c: &mut Criterion) {
c.bench_function("async fetch", |b| {
b.to_async(tokio::runtime::Runtime::new().unwrap())
.iter(|| async {
my_async_function().await
});
});
}
criterion_group!(benches, bench_async);
criterion_main!(benches);
66.3 Avoiding Pitfalls
| Pitfall | Problem | Fix |
|---|---|---|
| Dead code elimination | Compiler removes the computation entirely | Use black_box() on inputs AND outputs |
| Constant folding | Compiler pre-computes the result | Pass inputs via black_box() |
| Caching effects | First iteration is slow, rest are fast | Criterion warms up automatically |
| Measuring allocation only | Benchmark dominated by Vec::new() | Pre-allocate outside the measured block |
| Noise from other processes | Inconsistent results | Close other apps, use --sample-size 100 |
// BAD: optimizer might remove the entire computation
b.iter(|| fibonacci(20));
// GOOD: black_box prevents optimization
b.iter(|| black_box(fibonacci(black_box(20))));
// Pre-allocate the input; use iter_batched so the per-iteration
// clone happens in the setup closure and is NOT measured
let data: Vec<i32> = (0..10000).collect();
b.iter_batched(
|| data.clone(), // setup: excluded from timing
|mut sorted| { sorted.sort(); black_box(sorted) }, // routine: measured
criterion::BatchSize::SmallInput,
);
- divan — simpler criterion alternative with attribute macros (#[divan::bench]).
- iai-callgrind — instruction-count benchmarks (deterministic, CI-friendly, no noise).
- Built-in #[bench] — nightly only, basic, no statistics.
67. Borrow Checker vs NLL vs Polonius Advanced
What Does the Borrow Checker Actually Check?
A common misconception is that the borrow checker "only checks lifetimes." In reality, lifetimes are just one piece of a much larger set of rules the borrow checker enforces. The borrow checker is responsible for all of the following:
| # | Rule | What It Prevents |
|---|---|---|
| 1 | Ownership & Move Semantics — each value has exactly one owner; when ownership is transferred (moved), the old binding becomes invalid | Use-after-move bugs, double-free |
| 2 | Borrowing Exclusivity — at any point, you can have either ONE &mut T OR any number of &T, never both simultaneously | Data races, iterator invalidation, aliased mutation |
| 3 | Lifetime Validity — every reference must be valid (not dangling) for its entire lifetime; a reference cannot outlive the data it points to | Dangling pointers, use-after-free |
| 4 | Lifetime Relationships — when functions accept or return references, the compiler checks that output lifetimes are properly constrained by input lifetimes | Returning references to local variables, lifetime mismatch bugs |
| 5 | Mutable Access Paths — you cannot mutate a value while any part of it is borrowed, even through different fields in some cases | Partial mutation during active borrows |
| 6 | Reborrowing Rules — when you pass &mut to a function, the original mutable reference is temporarily "reborrowed" and cannot be used until the reborrow ends | Aliased mutable references |
// ── Rule 1: Ownership & Move ──
let s1 = String::from("hello");
let s2 = s1; // s1 is MOVED
// println!("{s1}"); // ❌ borrow of moved value
// ── Rule 2: Borrowing Exclusivity ──
let mut v = vec![1, 2];
let r = &v[0]; // immutable borrow
// v.push(3); // ❌ can't mutably borrow while immutable ref exists
println!("{r}"); // last use of r
v.push(3); // ✅ OK now (NLL: borrow ended at last use)
// ── Rule 3: Lifetime Validity (no dangling refs) ──
// fn dangling() -> &String {
// let s = String::from("hi");
// &s // ❌ s is dropped → dangling reference!
// }
// ── Rule 4: Lifetime Relationships ──
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
if a.len() > b.len() { a } else { b }
}
// Compiler ensures the returned ref doesn't outlive EITHER input
// ── Rule 5: Mutable Access Paths ──
let mut pair = (1, 2);
let r = &pair.0; // borrow of field .0
// pair.0 = 10; // ❌ can't mutate pair.0 while borrowed
// NOTE: pair.1 = 20 is actually OK — the compiler tracks disjoint fields
println!("{r}");
// ── Rule 6: Reborrowing ──
fn modify(v: &mut Vec<i32>) { v.push(42); }
let mut data = vec![1];
let r = &mut data;
modify(r); // r is reborrowed into the function
r.push(99); // ✅ OK — reborrow has ended, r is usable again
Lifetime checking is the rule most visibly tied to annotations like 'a, but the other rules (especially exclusivity and move checking) are equally important and happen without any annotations.
The Evolution at a Glance
1. The Original (Lexical) Borrow Checker — Pre-2018
The original borrow checker determined the lifetime of a reference from the lexical scope (the curly-brace block) in which it was created. A borrow lived from the let binding until the closing } — even if the reference was never used again. This was simple to implement but overly conservative.
// ❌ ERROR with the lexical borrow checker (pre-2018)
fn main() {
let mut data = vec![1, 2, 3];
let first = &data[0]; // immutable borrow starts here
println!("{}", first); // last use of `first`
data.push(4); // ❌ ERROR: `data` still borrowed!
} // borrow of `first` ends HERE at `}`
The immutable borrow first was done being used after println!, yet the compiler kept it alive until the end of the block. Developers were forced to use ugly workarounds like extra { } blocks to artificially shorten scopes:
// ✅ WORKAROUND: extra block to limit borrow scope
fn main() {
let mut data = vec![1, 2, 3];
{
let first = &data[0]; // borrow scoped to inner block
println!("{}", first);
} // borrow ends here
data.push(4); // ✅ OK now
}
2. Non-Lexical Lifetimes (NLL) — Rust 2018, Current Default
NLL, stabilised in Rust 1.31 (December 2018), was the single biggest ergonomics win in Rust's history. Instead of tying borrows to scopes, the compiler builds a control-flow graph (CFG) and computes the liveness region for every borrow — the set of program points where the reference might still be used.
// ✅ OK with NLL (Rust 2018+)
fn main() {
let mut data = vec![1, 2, 3];
let first = &data[0]; // immutable borrow starts
println!("{}", first); // last use → borrow ENDS here
data.push(4); // ✅ OK — no active borrow
}
Where NLL still struggles: NLL analyses borrows in a location-insensitive way for the origin (provenance) of references. It knows when a borrow is used but not always which specific data is borrowed through a given reference at each point. This leads to false rejections in conditional borrowing patterns:
// ❌ ERROR with NLL — conditional borrow + mutate pattern
use std::collections::HashMap;
fn get_or_insert<'a>(map: &'a mut HashMap<String, String>, key: &str) -> &'a String {
if let Some(v) = map.get(key) { // immutable borrow of `map`
return v; // returning the borrow
}
// NLL thinks `map` might still be immutably borrowed here,
// because the borrow *could* have flowed to the return value.
map.insert(key.to_owned(), "default".into()); // ❌ mutable borrow conflict
map.get(key).unwrap()
}
// The problem: NLL cannot see that the `return v` branch and
// the `map.insert` branch are MUTUALLY EXCLUSIVE.
3. Polonius — Next-Generation (Nightly/Experimental)
Polonius is the next-generation borrow checker being developed to replace NLL's internal engine. Named after Shakespeare's character from Hamlet, it flips the analysis model on its head.
Instead of asking "where is each reference live?", Polonius asks "which loans could each origin (reference provenance) contain at each program point?". This origin-sensitive, flow-sensitive analysis means Polonius can reason about conditional branches precisely:
// ✅ OK with Polonius — it tracks that the loan doesn't flow here
use std::collections::HashMap;
fn get_or_insert<'a>(map: &'a mut HashMap<String, String>, key: &str) -> &'a String {
if let Some(v) = map.get(key) {
return v; // loan flows to return → branch exits
}
// Polonius knows: if we reach here, the loan from `map.get()`
// did NOT flow to any live reference. It's dead.
map.insert(key.to_owned(), "default".into()); // ✅ no conflict
map.get(key).unwrap()
}
How Polonius Works Internally
Polonius uses Datalog-style rules to compute three key relations over the control-flow graph:
| Relation | Meaning |
|---|---|
| origin_contains_loan_at | At program point P, origin O may contain loan L |
| loan_invalidated_at | At point P, an action invalidates (conflicts with) loan L |
| errors | At point P, loan L is both in some live origin AND invalidated → compile error |
By intersecting "which loans are reachable" with "which loans are invalidated" at each point, Polonius achieves precise, flow-sensitive, origin-sensitive analysis.
Side-by-Side Comparison
| Feature | Lexical Borrow Checker | NLL | Polonius |
|---|---|---|---|
| Era | Rust 1.0 – 1.30 | Rust 1.31+ (2018 edition) | Nightly (experimental) |
| Borrow ends at | End of lexical scope } | Last use of the reference | Last use (with smarter flow analysis) |
| Analysis model | Scope-based | Liveness-based (CFG) | Origin/provenance-based (Datalog) |
| Flow sensitivity | None (lexical) | Partial — lifetimes are flow-sensitive, origins are not | Full — both lifetimes and origins are flow-sensitive |
| Conditional borrows | Poor — needs workarounds | Better — struggles with conditional return patterns | Excellent — tracks which branch a loan flows through |
| Soundness | Sound (overly conservative) | Sound (less conservative) | Sound (least conservative) |
| False positives | Many | Some | Very few |
Practical Patterns Affected
// Pattern 1: Use-then-mutate — OK since NLL
let mut v = vec![1, 2, 3];
let last = v.last(); // immutable borrow
println!("{:?}", last); // last use
v.push(4); // ✅ NLL ends the borrow above
// Pattern 2: Conditional borrow & mutate — needs Polonius
use std::collections::HashMap;
fn update_or_insert(map: &mut HashMap<i32, Vec<i32>>, key: i32, val: i32) {
if let Some(vec) = map.get_mut(&key) {
vec.push(val); // mutably borrow map → get inner vec
return;
}
map.insert(key, vec![val]); // NLL rejects; Polonius accepts
}
The idiomatic workaround today is the Entry API: map.entry(key).or_insert_with(|| vec![val]). The Entry API was designed precisely to work around these borrow checker limitations by combining the lookup and the insert into a single operation.
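For reference, a runnable sketch of the Entry-API version, which compiles on stable today:

```rust
use std::collections::HashMap;

fn update_or_insert(map: &mut HashMap<i32, Vec<i32>>, key: i32, val: i32) {
    // entry() does the lookup once and returns a view that can
    // either mutate the existing value or insert a new one.
    map.entry(key).or_insert_with(Vec::new).push(val);
}

fn main() {
    let mut map = HashMap::new();
    update_or_insert(&mut map, 1, 10);
    update_or_insert(&mut map, 1, 20);
    assert_eq!(map[&1], vec![10, 20]);
}
```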
Timeline
Try Polonius Today
# Install nightly
rustup toolchain install nightly
# Compile with Polonius enabled
RUSTFLAGS="-Z polonius" cargo +nightly check
# Or set it in .cargo/config.toml
[build]
rustflags = ["-Z", "polonius"]
68. Drop Order & Drop Guarantees Advanced
The Drop Order Rules
Rust's drop order is deterministic but follows different rules depending on context:
| Context | Drop Order | Why |
|---|---|---|
| Local variables | Reverse declaration order (last declared = first dropped) | Like a stack — LIFO. Ensures later variables (which may borrow earlier ones) are dropped first. |
| Struct fields | Declaration order (first field = first dropped) | Fields drop top-to-bottom as declared in the struct definition. |
| Tuple fields | Declaration order (index 0 first, then 1, etc.) | Same as structs — left to right. |
| Enum variants | Fields of the active variant drop in declaration order | Only the active variant's fields exist. |
| Closure captures | Order they appear in the closure body (since Edition 2021) | Pre-2021: all captures dropped at once. 2021+: individual drop order matters. |
| Temporaries | End of the statement (usually), unless lifetime-extended | Temporaries created in let x = &temp(); are extended to match x's scope. |
// ── Local variables: reverse declaration order ──
struct Noisy(&'static str);
impl Drop for Noisy {
fn drop(&mut self) { println!("dropping {}", self.0); }
}
fn main() {
let a = Noisy("A"); // declared first
let b = Noisy("B"); // declared second
let c = Noisy("C"); // declared third
}
// Output: dropping C, dropping B, dropping A (reverse order!)
// ── Struct fields: declaration order ──
struct MyStruct {
first: Noisy, // dropped 1st
second: Noisy, // dropped 2nd
third: Noisy, // dropped 3rd
}
// Output: dropping first, dropping second, dropping third
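The temporary-lifetime row from the table can be observed directly. This sketch records drops in a log instead of printing (the log-based Noisy mirrors the printing one above and is illustrative):

```rust
use std::cell::RefCell;

thread_local! {
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct Noisy(&'static str);
impl Drop for Noisy {
    fn drop(&mut self) {
        LOG.with(|l| l.borrow_mut().push(self.0));
    }
}

fn main() {
    {
        // `let r = &<temporary>` EXTENDS the temporary to the end of this block
        let r = &Noisy("extended");
        // A temporary NOT bound by `let` dies at the end of its statement:
        let len = Noisy("statement").0.len();
        assert_eq!(len, 9);
        LOG.with(|l| assert_eq!(*l.borrow(), vec!["statement"]));
        let _ = r; // `r` (and its extended temporary) still alive here
    } // "extended" is dropped at this closing brace
    LOG.with(|l| assert_eq!(*l.borrow(), vec!["statement", "extended"]));
}
```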
Why This Matters: Lock Ordering
// ⚠️ Unlock order follows variable declaration order (LIFO)!
let guard_a = mutex_a.lock().unwrap(); // locked first
let guard_b = mutex_b.lock().unwrap(); // locked second
// guard_b drops first (reverse order) → mutex_b unlocked first
// guard_a drops second → mutex_a unlocked second
// Note: what actually prevents deadlocks is a CONSISTENT acquisition
// order across threads; drop order just makes the release order predictable.
ManuallyDrop & mem::forget
ManuallyDrop<T> wraps a value and prevents automatic drop. You must manually call ManuallyDrop::drop() or extract the value with ManuallyDrop::into_inner(). This is useful for FFI, unions, and custom smart pointers.
mem::forget(value) consumes a value without running its destructor. It's a safe function — Rust's safety guarantees don't depend on destructors running. Leaking memory is not undefined behavior; it's just a resource leak. This is why Rc cycles can leak — Drop is not guaranteed.
use std::mem::{ManuallyDrop, forget};
// ManuallyDrop — explicit control
let mut md = ManuallyDrop::new(String::from("hello"));
// value is NOT dropped when md goes out of scope
// must manually drop:
unsafe { ManuallyDrop::drop(&mut md); }
// mem::forget — consume without dropping (safe!)
let s = String::from("leaked");
forget(s); // String's heap memory is leaked, destructor never runs
// std::mem::drop — explicitly drop EARLY (also safe)
let guard = mutex.lock().unwrap();
// ... use guard ...
drop(guard); // unlock NOW instead of waiting for scope end
// ... do things that don't need the lock ...
Rust does not guarantee that Drop will run. mem::forget, Rc cycles, and ManuallyDrop can all prevent destructors from executing. This is why Rust's safety model never relies on Drop for soundness — only for resource cleanup.
69. Rust Compilation Pipeline Advanced
The Pipeline
Key Stages Explained
HIR (High-level IR): The AST after all syntactic sugar is removed. for x in iter becomes loop { match iter.next() { ... } }. The ? operator becomes a match with early return. Type inference and trait resolution happen at this level.
MIR (Mid-level IR): A control-flow graph where each node is a basic block of simple statements. This is where the borrow checker runs — NLL (and future Polonius) analyses happen on MIR because it has explicit control flow. MIR is also where const evaluation and some optimizations occur before LLVM.
Monomorphization: When you write fn foo<T>(x: T), the compiler generates a separate copy of foo for every concrete T used (e.g., foo_i32, foo_String). This is why generics are "zero-cost" at runtime but can cause code bloat and longer compile times.
// This generic function...
fn add<T: std::ops::Add<Output=T>>(a: T, b: T) -> T { a + b }
// ...becomes MULTIPLE functions after monomorphization:
// fn add_i32(a: i32, b: i32) -> i32 { a + b }
// fn add_f64(a: f64, b: f64) -> f64 { a + b }
// fn add_u8(a: u8, b: u8) -> u8 { a + b }
// Each is fully optimized for its type — zero runtime dispatch
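The trade-off can be seen side by side; dyn dispatch is the standard way to opt out of monomorphization when binary size matters more than call speed:

```rust
use std::fmt::Display;
use std::ops::Add;

// Monomorphized (static dispatch): one specialized copy per concrete T
fn add_generic<T: Add<Output = T>>(a: T, b: T) -> T {
    a + b
}

// Dynamic dispatch: ONE compiled copy, calls go through a vtable
fn describe_dyn(x: &dyn Display) -> String {
    format!("{x}")
}

fn main() {
    assert_eq!(add_generic(1_i32, 2_i32), 3);       // instantiates add_generic::<i32>
    assert_eq!(add_generic(1.5_f64, 2.5_f64), 4.0); // instantiates add_generic::<f64>
    assert_eq!(describe_dyn(&42), "42");   // no new copy per type
    assert_eq!(describe_dyn(&"hi"), "hi"); // smaller binary, slightly slower call
}
```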
Why Compile Times Are Slow
| Cause | Explanation |
|---|---|
| Monomorphization | Generics multiply code — Vec<T> used with 20 types → 20 copies of every method |
| LLVM optimization | LLVM's optimizer is powerful but slow; most compile time is spent here |
| Borrow checking | MIR-level analysis adds time, especially with complex lifetimes |
| Proc macros | Each proc macro runs a separate compilation of the macro crate |
| Linking | Final linking step; mold or lld linkers can dramatically speed this up |
Speeding Up Compilation
# .cargo/config.toml — common speed-ups
[build]
rustflags = ["-C", "linker=clang", "-C", "link-arg=-fuse-ld=mold"]
[profile.dev]
opt-level = 0 # no optimization for dev builds
debug = 2 # full debug info
[profile.dev.package."*"]
opt-level = 2 # optimize dependencies but not your code
# Use cargo-nextest for faster test execution
# Use sccache for caching compilation across projects
70. Sized, ?Sized & Dynamically Sized Types Advanced
Understanding Sized is essential to grasping why str is not the same as String, why you can't pass dyn Trait by value, and how fat pointers work.
What Is Sized?
Sized is a marker trait that the compiler automatically implements for every type whose size is known at compile time. Almost all types are Sized. By default, every generic parameter has an implicit T: Sized bound.
// These are equivalent — Sized is implicit:
fn foo<T>(x: T) {}
// fn foo<T: Sized>(x: T) {} // identical — Sized is always implied
// To opt OUT of Sized, use ?Sized:
fn bar<T: ?Sized>(x: &T) {} // T might not be Sized (e.g., str, [u8], dyn Trait)
Dynamically Sized Types (DSTs)
A DST is a type whose size is not known at compile time. You can never have a DST on the stack directly — you must always access it behind a pointer (&T, Box<T>, Rc<T>).
| DST | What It Is | How You Use It |
|---|---|---|
| str | A UTF-8 string of unknown length | &str, Box<str> |
| [T] | A slice of unknown length | &[T], Box<[T]> |
| dyn Trait | A trait object of unknown concrete type | &dyn Trait, Box<dyn Trait> |
Fat Pointers: How DSTs Work Behind a Reference
A reference to a DST is a fat pointer — two machine words instead of one:
use std::mem::size_of;
// Thin pointers (Sized types) — 8 bytes on 64-bit
assert_eq!(size_of::<&i32>(), 8);
assert_eq!(size_of::<&String>(), 8);
assert_eq!(size_of::<&Vec<u8>>(), 8);
// Fat pointers (DSTs) — 16 bytes on 64-bit
assert_eq!(size_of::<&str>(), 16); // ptr + length
assert_eq!(size_of::<&[u8]>(), 16); // ptr + length
assert_eq!(size_of::<&dyn ToString>(), 16); // ptr + vtable ptr
Writing ?Sized-Generic Code
// This only accepts Sized types:
fn print_sized<T: std::fmt::Display>(x: &T) {
println!("{x}");
}
// print_sized::<str>("hello"); // ❌ the implicit T: Sized bound rejects str
// This accepts BOTH Sized and unsized types:
fn print_any<T: std::fmt::Display + ?Sized>(x: &T) {
println!("{x}");
}
print_any::<str>("hello"); // ✅ works with DST
print_any(&42); // ✅ works with Sized too
Sized is implicitly bound on every generic. ?Sized opts out, allowing DSTs like str, [T], and dyn Trait. DSTs always live behind a fat pointer (data ptr + length or vtable). This is why &str is 16 bytes and &i32 is 8 bytes.
71. Interior Mutability Deep Dive Advanced
Interior mutability lets you mutate data through a shared reference (&T). This "breaks" the borrow checker's rule at the type level by moving the check to runtime or using atomic operations. Understanding the entire family tree is essential.
The Primitive: UnsafeCell<T>
UnsafeCell<T> is the only legal way to obtain a mutable pointer from a shared reference in Rust. Every interior-mutability type is built on top of it. The compiler treats UnsafeCell specially — it disables the "immutable through shared reference" optimization.
// UnsafeCell is the foundation — all others build on this
use std::cell::UnsafeCell;
let uc = UnsafeCell::new(42);
// Only way to get mutable access through &UnsafeCell:
let ptr: *mut i32 = uc.get();
unsafe { *ptr = 99; } // you manage the safety invariants
The Complete Interior Mutability Family
| Type | Check | Thread-Safe? | Use Case |
|---|---|---|---|
| Cell<T> | None (copy/replace only) | ❌ No (!Sync) | Simple values, counters, flags. T must be Copy for .get() |
| RefCell<T> | Runtime borrow check | ❌ No (!Sync) | Complex values needing &mut access; panics on double-mut-borrow |
| OnceCell<T> | Write-once check | ❌ No (!Sync) | Lazy init, write once then read forever (single-thread) |
| Mutex<T> | OS-level lock | ✅ Yes | Thread-safe mutable access; blocks on contention |
| RwLock<T> | OS-level read-write lock | ✅ Yes | Multiple readers OR one writer; good for read-heavy workloads |
| OnceLock<T> | Thread-safe write-once | ✅ Yes | Lazy init across threads (like OnceCell but Sync) |
| LazyLock<T> | Thread-safe lazy init | ✅ Yes | Computed on first access; replaces lazy_static! |
| Atomic* | Hardware atomic ops | ✅ Yes | Lock-free counters, flags, pointers. Fastest for simple types. |
use std::cell::{Cell, RefCell, OnceCell};
use std::sync::{Mutex, OnceLock, LazyLock};
// ── Cell: simple set/get for Copy types ──
let counter = Cell::new(0);
counter.set(counter.get() + 1); // mutate through &Cell
// ── RefCell: runtime-checked &mut borrows ──
let data = RefCell::new(vec![1, 2]);
data.borrow_mut().push(3); // runtime borrow check
// data.borrow_mut() + data.borrow() simultaneously → PANIC!
// ── OnceCell: write once, read many (single-thread) ──
let cell = OnceCell::new();
cell.set("initialized".to_string()).unwrap();
// cell.set(...) again → Err (already set)
// ── OnceLock: thread-safe write-once ──
static CONFIG: OnceLock<String> = OnceLock::new();
let val = CONFIG.get_or_init(|| "production".to_string());
// ── LazyLock: thread-safe lazy init (replaces lazy_static!) ──
static REGEX: LazyLock<regex::Regex> = LazyLock::new(|| {
regex::Regex::new(r"^\d{4}-\d{2}-\d{2}$").unwrap()
});
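The thread-safe half of the family can be sketched with std types only (the values are illustrative):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

fn main() {
    // Mutex: exclusive access across threads
    let shared = Arc::new(Mutex::new(Vec::new()));
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || shared.lock().unwrap().push(i))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(shared.lock().unwrap().len(), 4);

    // RwLock: many readers OR one writer
    let cfg = RwLock::new(String::from("debug"));
    assert_eq!(cfg.read().unwrap().as_str(), "debug");
    *cfg.write().unwrap() = String::from("release");
    assert_eq!(cfg.read().unwrap().as_str(), "release");

    // Atomic: lock-free counter, no guard object at all
    let hits = AtomicUsize::new(0);
    hits.fetch_add(1, Ordering::Relaxed);
    assert_eq!(hits.load(Ordering::Relaxed), 1);
}
```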
Decision Tree
72. Deref Coercion & Conversion Traits Advanced
"Should this parameter be AsRef<str>, Into<String>, or &str?" is a very common Rust API design question. Understanding the coercion and conversion ecosystem is essential.
Deref Coercion: Automatic Type Conversion
When you have &T but need &U, Rust will automatically dereference if T: Deref<Target = U>. This happens implicitly at call sites — it's why &String works where &str is expected.
// The Deref coercion chain:
// String → str (String: Deref<Target = str>)
// Vec<T> → [T] (Vec<T>: Deref<Target = [T]>)
// Box<T> → T (Box<T>: Deref<Target = T>)
// Arc<T> → T (Arc<T>: Deref<Target = T>)
// Rc<T> → T (Rc<T>: Deref<Target = T>)
// Cow<T> → T (Cow<'_, T>: Deref<Target = T>)
fn takes_str(s: &str) { println!("{s}"); }
let owned = String::from("hello");
takes_str(&owned); // ✅ &String → &str via Deref coercion
let boxed: Box<String> = Box::new("world".into());
takes_str(&boxed); // ✅ &Box<String> → &String → &str (chained!)
let v = vec![1, 2, 3];
fn takes_slice(s: &[i32]) {}
takes_slice(&v); // ✅ &Vec<i32> → &[i32] via Deref
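Deref coercion is not limited to std types; implementing Deref on your own wrapper opts it into the same chain (MyBox is illustrative):

```rust
use std::ops::Deref;

struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

fn takes_str(s: &str) -> usize {
    s.len()
}

fn main() {
    let b = MyBox(String::from("hello"));
    // &MyBox<String> → &String → &str: TWO coercion steps, chained
    assert_eq!(takes_str(&b), 5);
    // Explicit spelling of what the compiler inserts:
    assert_eq!(takes_str(&**b), 5);
}
```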
Deref Coercion Rules
| From | To | Condition |
|---|---|---|
| &T | &U | T: Deref<Target = U> |
| &mut T | &mut U | T: DerefMut<Target = U> |
| &mut T | &U | T: Deref<Target = U> (mut → immutable is always OK) |
| &T | &mut U | NEVER — can't go from shared to exclusive |
The Conversion Trait Zoo
Rust has several conversion traits that serve different purposes:
| Trait | Method | Cost | When to Use |
|---|---|---|---|
| AsRef<U> | .as_ref() → &U | Free (just a reference cast) | Accept anything that can cheaply "view as" &U. Use for read-only access. |
| AsMut<U> | .as_mut() → &mut U | Free | Same as AsRef but mutable. |
| Borrow<U> | .borrow() → &U | Free | Like AsRef but with a contract: Hash, Eq, Ord must be consistent between T and U. Used by HashMap/BTreeMap for key lookups. |
| From<T> | U::from(t) → U | May allocate | Infallible type conversion. Implementing From<T> auto-provides Into<U>. |
| Into<U> | t.into() → U | May allocate | Used in generic bounds: fn foo(s: impl Into<String>). Don't implement directly — implement From. |
| TryFrom<T> | U::try_from(t) → Result<U, E> | May allocate | Fallible conversion. E.g., u8::try_from(256_i32) → Err. |
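Since the table distinguishes infallible From/Into from fallible TryFrom, a quick runnable sketch:

```rust
use std::convert::TryFrom; // already in the prelude in edition 2021+

fn main() {
    // Infallible: From / Into (widening conversions always succeed)
    let n: i64 = i64::from(42_i32);
    assert_eq!(n, 42);
    let s: String = "hello".into(); // &str → String allocates
    assert_eq!(s.len(), 5);

    // Fallible: TryFrom returns a Result
    assert_eq!(u8::try_from(255_i32), Ok(255));
    assert!(u8::try_from(256_i32).is_err()); // 256 doesn't fit in a u8
}
```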
API Design Patterns
// ── Pattern 1: Accept &str or &String with AsRef ──
fn log_message(msg: impl AsRef<str>) {
println!("LOG: {}", msg.as_ref());
}
log_message("literal"); // &str
log_message(String::from("owned")); // String
log_message(&String::from("ref")); // &String
// ── Pattern 2: Accept &str or String with Into (takes ownership) ──
fn set_name(name: impl Into<String>) {
let _name: String = name.into(); // zero-cost if already String
}
set_name("literal"); // &str → allocates
set_name(String::from("owned")); // String → no allocation!
// ── Pattern 3: HashMap lookup with Borrow ──
use std::collections::HashMap;
let mut map = HashMap::new();
map.insert(String::from("key"), 42);
// Can look up with &str even though keys are String
// because String: Borrow<str>
let val = map.get("key"); // ✅ no allocation needed for lookup!
Rule of thumb: borrowing a string for read-only access? Use impl AsRef<str>. Taking ownership? Use impl Into<String>. HashMap keys? That works because of Borrow. Deref coercion handles the rest automatically.
73. Type Erasure Patterns Advanced
Approach 1: Trait Objects (dyn Trait)
The most common form of type erasure. The concrete type is erased at compile time, replaced by a vtable pointer for dynamic dispatch.
trait Animal {
fn speak(&self) -> &str;
}
struct Dog;
struct Cat;
impl Animal for Dog { fn speak(&self) -> &str { "Woof" } }
impl Animal for Cat { fn speak(&self) -> &str { "Meow" } }
// Heterogeneous collection — different concrete types, one Vec
let animals: Vec<Box<dyn Animal>> = vec![
Box::new(Dog),
Box::new(Cat),
];
for a in &animals {
println!("{}", a.speak()); // dynamic dispatch via vtable
}
Approach 2: Downcasting with Any
std::any::Any lets you store any 'static type and later downcast to recover the concrete type. This is Rust's "runtime type info" — but opt-in and safe.
use std::any::Any;
fn print_if_string(value: &dyn Any) {
if let Some(s) = value.downcast_ref::<String>() {
println!("It's a String: {s}");
} else if let Some(n) = value.downcast_ref::<i32>() {
println!("It's an i32: {n}");
} else {
println!("Unknown type (TypeId: {:?})", value.type_id());
}
}
print_if_string(&String::from("hello")); // It's a String: hello
print_if_string(&42_i32); // It's an i32: 42
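Downcasting also works by value through Box<dyn Any>, recovering full ownership of the erased value:

```rust
use std::any::Any;

fn main() {
    let boxed: Box<dyn Any> = Box::new(String::from("hello"));
    // downcast::<T>() consumes the box: Result<Box<T>, Box<dyn Any>>
    match boxed.downcast::<String>() {
        Ok(s) => assert_eq!(*s, "hello"), // recovered the owned String
        Err(_) => panic!("wrong type"),
    }

    let not_a_string: Box<dyn Any> = Box::new(42_i32);
    assert!(not_a_string.downcast::<String>().is_err());
}
```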
Approach 3: Enum-Based Type Erasure
When you have a known, fixed set of types, an enum is more efficient (no heap allocation, no vtable). This is common for message/event systems.
enum Value {
Int(i64),
Float(f64),
Text(String),
Bool(bool),
List(Vec<Value>), // recursive!
}
// No heap allocation for the dispatch, no vtable
fn display(v: &Value) {
match v {
Value::Int(n) => println!("int: {n}"),
Value::Float(f) => println!("float: {f}"),
Value::Text(s) => println!("text: {s}"),
Value::Bool(b) => println!("bool: {b}"),
Value::List(l) => println!("list of {} items", l.len()),
}
}
Approach 4: Closure-Based Erasure
Closures capture their environment, and Box<dyn Fn()> erases the closure's unique type. This is how callback systems and event handlers work.
type Callback = Box<dyn Fn(i32) -> i32>;
fn make_adder(n: i32) -> Callback {
Box::new(move |x| x + n) // closure type is erased behind dyn Fn
}
let callbacks: Vec<Callback> = vec![
make_adder(1),
make_adder(100),
Box::new(|x| x * 2),
];
// Each closure has a different anonymous type, but all are dyn Fn(i32) -> i32
Choosing an approach: open-ended set of types? → dyn Trait (trait objects). Known, fixed set of types? → an enum. Need to recover the original type? → dyn Any + downcast. Callbacks? → Box<dyn Fn(...)>.
74. Tower / Service Trait Pattern Advanced
Tower is the middleware framework that underpins Axum, Tonic, and much of the async Rust ecosystem, and it is built around a single Service trait. Understanding it is essential for backend Rust interviews.
The Service Trait
At its core, Tower defines a single trait that represents "something that handles requests and produces responses asynchronously":
// Simplified Tower Service trait
pub trait Service<Request> {
type Response;
type Error;
type Future: Future<Output = Result<Self::Response, Self::Error>>;
/// Check if the service is ready to accept a request
fn poll_ready(&mut self, cx: &mut Context) -> Poll<Result<(), Self::Error>>;
/// Process a request and return a future
fn call(&mut self, req: Request) -> Self::Future;
}
// Key insight: the trait takes &mut self, enabling backpressure
// poll_ready() lets a service say "I'm overwhelmed, wait"
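To make the trait concrete, here is a self-contained toy implementation of the simplified trait above (this defines its own Service trait rather than using the real tower crate; ShoutService and the no-op waker are illustrative):

```rust
use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The simplified Service trait sketched above (NOT the real tower crate)
trait Service<Request> {
    type Response;
    type Error;
    type Future: Future<Output = Result<Self::Response, Self::Error>>;
    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
    fn call(&mut self, req: Request) -> Self::Future;
}

// A toy service: echoes the request back, uppercased
struct ShoutService;

impl Service<String> for ShoutService {
    type Response = String;
    type Error = std::convert::Infallible;
    type Future = Ready<Result<String, Self::Error>>;

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(())) // always ready: no backpressure in this toy
    }

    fn call(&mut self, req: String) -> Self::Future {
        ready(Ok(req.to_uppercase()))
    }
}

// Minimal no-op waker so we can poll without an async runtime
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn main() {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut svc = ShoutService;

    assert!(matches!(svc.poll_ready(&mut cx), Poll::Ready(Ok(()))));
    let mut fut = svc.call("hello".to_string());
    match Pin::new(&mut fut).poll(&mut cx) {
        Poll::Ready(Ok(resp)) => assert_eq!(resp, "HELLO"),
        _ => unreachable!("Ready futures resolve on the first poll"),
    }
}
```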
The Middleware / Layer Pattern
Tower's power comes from composable middleware. A Layer wraps a service to add behaviour (logging, timeouts, retries, auth) without modifying the inner service:
// A Layer wraps one Service to produce another Service
pub trait Layer<S> {
type Service;
fn layer(&self, inner: S) -> Self::Service;
}
// Example: building a middleware stack
use tower::{ServiceBuilder, timeout::TimeoutLayer, limit::RateLimitLayer};
use std::time::Duration;
let service = ServiceBuilder::new()
.layer(TimeoutLayer::new(Duration::from_secs(30)))
.layer(RateLimitLayer::new(100, Duration::from_secs(1)))
// .layer(AuthLayer::new(auth_config))
// .layer(TracingLayer::new())
.service(my_handler);
// Request flows: → Rate Limit → Timeout → Handler
// Response flows back in reverse through each layer
Axum Uses Tower Natively
use axum::{Router, routing::get, middleware};
use tower_http::trace::TraceLayer;
use tower_http::cors::CorsLayer;
let app = Router::new()
.route("/", get(handler))
.layer(TraceLayer::new_for_http()) // Tower middleware
.layer(CorsLayer::permissive()); // Tower middleware
// Every Axum handler IS a Tower Service under the hood
// Every middleware IS a Tower Layer
Writing Your Own Middleware
use tower::{Layer, Service};
use std::task::{Context, Poll};
use std::future::Future;
use std::pin::Pin;
// 1. Define the Layer (factory)
struct LogLayer;
impl<S> Layer<S> for LogLayer {
type Service = LogService<S>;
fn layer(&self, inner: S) -> Self::Service {
LogService { inner }
}
}
// 2. Define the Service (wrapper)
struct LogService<S> { inner: S }
impl<S, Req> Service<Req> for LogService<S>
where
S: Service<Req>,
Req: std::fmt::Debug,
{
type Response = S::Response;
type Error = S::Error;
type Future = S::Future;
fn poll_ready(&mut self, cx: &mut Context) -> Poll<Result<(), Self::Error>> {
self.inner.poll_ready(cx)
}
fn call(&mut self, req: Req) -> Self::Future {
println!("Request: {:?}", req);
self.inner.call(req)
}
}
Service trait is to async Rust what Iterator is to synchronous Rust — a composable abstraction that the entire ecosystem builds upon. The poll_ready method enables backpressure, and the Layer trait enables middleware composition.
75. Workspace & Multi-Crate Architecture Intermediate
Workspace Basics
A Cargo workspace is a set of crates that share a single Cargo.lock, output directory, and can depend on each other with path dependencies.
# Root Cargo.toml
[workspace]
resolver = "2"
members = [
"crates/core",
"crates/api",
"crates/cli",
"crates/shared",
]
# Shared dependency versions across workspace
[workspace.dependencies]
serde = { version = "1", features = ["derive"] }
tokio = { version = "1", features = ["full"] }
anyhow = "1"
Common Architecture Patterns
When to Split Into Crates
| Signal | Action |
|---|---|
| Compile time too slow | Split into crates — each crate compiles independently & in parallel |
| Multiple binaries share logic | Extract shared logic into a library crate |
| Want to publish parts separately | Separate crate per publishable unit |
| Different feature flag requirements | Separate crates to avoid feature unification issues |
| Clean dependency boundaries | Core crate with no I/O deps; adapters in separate crates |
Workspace Dependencies
# In a member crate's Cargo.toml:
[dependencies]
# Use workspace version (inherits version + features from root)
serde = { workspace = true }
tokio = { workspace = true }
# Path dependency on sibling crate
core = { path = "../core" }
# This ensures all crates use the SAME version of shared deps
# No more "which version of serde are we using?" confusion
Feature Flags Across Workspaces
# Root Cargo.toml — shared feature definitions
[workspace.dependencies]
my-core = { path = "crates/core", default-features = false }
# crates/core/Cargo.toml
[features]
default = ["std"]
std = []
# no_std support by disabling default features
# crates/api/Cargo.toml
[dependencies]
my-core = { workspace = true, features = ["std"] }
# ⚠️ Feature unification: if ANY crate enables a feature,
# it's enabled for ALL crates in the workspace that depend on it.
# This is why separating crates helps avoid unwanted features.
76. Lifetime Elision Deep Dive Advanced
What Is Lifetime Elision?
Lifetime elision is a set of deterministic rules the compiler applies to function signatures to infer lifetime parameters automatically. These are not heuristics or special cases — they're a fixed algorithm. If the rules produce a unique, unambiguous answer, no annotations are needed. If they don't, you must annotate explicitly.
The Three Elision Rules (in order)
Rule 1 (Input Lifetimes): Each reference parameter gets its own distinct lifetime. This always runs first.
// What you write: // What the compiler sees:
fn foo(x: &str) fn foo<'a>(x: &'a str)
fn bar(x: &str, y: &str) fn bar<'a, 'b>(x: &'a str, y: &'b str)
fn baz(x: &str, y: &str, z: &str) fn baz<'a, 'b, 'c>(x: &'a str, y: &'b str, z: &'c str)
Rule 2 (Single Input → Output): If there is exactly one input lifetime, it is assigned to all output references.
// What you write: // What the compiler sees:
fn first_word(s: &str) -> &str fn first_word<'a>(s: &'a str) -> &'a str
fn trim(s: &str) -> &str fn trim<'a>(s: &'a str) -> &'a str
fn as_bytes(s: &str) -> &[u8] fn as_bytes<'a>(s: &'a str) -> &'a [u8]
// ❌ Rule 2 does NOT apply with two inputs:
// fn longest(a: &str, b: &str) -> &str
// Two input lifetimes ('a, 'b) — which one for the output?
// Compiler: "I can't decide" → you must annotate!
Rule 3 (Method &self → Output): If one of the inputs is &self or &mut self, its lifetime is assigned to all output references. This rule only applies to methods (inside impl blocks).
struct Parser<'src> { source: &'src str }
impl<'src> Parser<'src> {
// What you write:
fn next_token(&self) -> &str { self.source }
// What the compiler sees (Rule 3 applies):
// fn next_token<'a>(&'a self) -> &'a str
// Output borrows from self, not from 'src!
// This is actually correct here because self holds the source.
// ⚠️ DANGER: Rule 3 can give the WRONG lifetime!
fn search(&self, query: &str) -> &str {
// Compiler thinks output borrows from &self (Rule 3)
// But what if we want to return a slice of query?
// We'd need explicit annotation to override:
// fn search<'q>(&self, query: &'q str) -> &'q str
self.source
}
}
When Elision Fails — Must Annotate
// ❌ Case 1: Multiple input lifetimes, no &self
// fn longest(a: &str, b: &str) -> &str { ... }
// ^-- which lifetime?
// FIX:
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
if a.len() > b.len() { a } else { b }
}
// ❌ Case 2: Struct with reference in return (no &self)
// fn make_excerpt(text: &str) -> Excerpt { ... }
// FIX:
fn make_excerpt<'a>(text: &'a str) -> Excerpt<'a> {
Excerpt { part: &text[..5] }
}
// ❌ Case 3: Want output to borrow from a DIFFERENT input than Rule 3 picks
impl MyStruct {
// Rule 3 would make output borrow from &self,
// but we actually return from `input`
fn transform<'a>(&self, input: &'a str) -> &'a str {
&input[1..]
}
}
Elision in impl Blocks and Trait Definitions
// Elision works in trait definitions too:
trait Summarize {
fn summary(&self) -> &str; // Rule 3: output borrows from self
}
// In impl blocks, each method follows the same three rules:
impl Summarize for Article {
fn summary(&self) -> &str { &self.headline }
}
Elision in static and const Contexts
// In static/const positions, elided lifetimes default to 'static:
const GREETING: &str = "Hello";
// This is actually: const GREETING: &'static str = "Hello";
// In trait object bounds, elision follows different rules:
// Box<dyn Trait> → Box<dyn Trait + 'static>
// &'a dyn Trait → &'a (dyn Trait + 'a)
// These defaults can be surprising and are worth knowing!
The Complete Elision Algorithm (Step by Step)
1) Give each input reference its own fresh lifetime. 2) If there is exactly one input lifetime, assign it to every output reference. 3) If one of the inputs is &self or &mut self, assign self's lifetime to every output reference. If none of these rules produce a unique output lifetime, you must annotate. The rules can sometimes give the wrong lifetime (e.g., Rule 3 picks &self's lifetime when the output actually borrows from a different parameter), which is why understanding them deeply matters.
77. Atomic Operations Deep Dive Advanced
What Are Atomic Operations?
An atomic operation is indivisible — no other thread can observe it "half-done." When you write counter.fetch_add(1, Ordering::Relaxed), the read-modify-write happens as a single, uninterruptible CPU instruction. No thread will ever see a torn or partially-written value.
Atomics are built on hardware primitives: x86 uses LOCK-prefixed instructions and CMPXCHG; ARM uses LDXR/STXR (load-exclusive/store-exclusive). Rust's atomic types map directly to these.
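A quick stdlib-only sketch of the indivisibility claim: several threads hammer one counter with fetch_add and no increments are ever lost (with a plain u64 and no synchronisation, the final count could come up short):

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // One indivisible read-modify-write per call: no torn values.
                    c.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    // Exact every run, regardless of interleaving.
    assert_eq!(parallel_count(4, 10_000), 40_000);
    println!("no lost updates");
}
```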
The Complete Atomic Type Family
| Type | Size | Key Operations | Common Use |
|---|---|---|---|
| AtomicBool | 1 byte | load, store, swap, compare_exchange, fetch_and, fetch_or, fetch_xor | Flags, shutdown signals, once-init guards |
| AtomicU8 / AtomicI8 | 1 byte | All arithmetic + bitwise atomics | Compact state machines, bitmasks |
| AtomicU16 / AtomicI16 | 2 bytes | All arithmetic + bitwise atomics | Compact counters, generation counters |
| AtomicU32 / AtomicI32 | 4 bytes | All arithmetic + bitwise atomics | Reference counts, epoch counters |
| AtomicU64 / AtomicI64 | 8 bytes | All arithmetic + bitwise atomics | Timestamps, large counters, packed state |
| AtomicUsize / AtomicIsize | pointer-sized | All arithmetic + bitwise atomics | Indices, counts, general-purpose |
| AtomicPtr<T> | pointer-sized | load, store, swap, compare_exchange | Lock-free data structures, RCU patterns |
Every Atomic Method Explained
use std::sync::atomic::{AtomicU64, Ordering};
let atom = AtomicU64::new(10);
// ── Read ──
let val = atom.load(Ordering::Acquire); // read current value: 10
// ── Write ──
atom.store(20, Ordering::Release); // set to 20
// ── Swap (exchange) — returns OLD value ──
let old = atom.swap(30, Ordering::AcqRel); // old = 20, now 30
// ── Arithmetic — all return the PREVIOUS value ──
let prev = atom.fetch_add(5, Ordering::Relaxed); // prev = 30, now 35
let prev = atom.fetch_sub(3, Ordering::Relaxed); // prev = 35, now 32
let prev = atom.fetch_max(50, Ordering::Relaxed); // prev = 32, now 50
let prev = atom.fetch_min(40, Ordering::Relaxed); // prev = 50, now 40
// ── Bitwise — all return the PREVIOUS value ──
let prev = atom.fetch_or(0xFF, Ordering::Relaxed); // bitwise OR
let prev = atom.fetch_and(0x0F, Ordering::Relaxed); // bitwise AND (mask)
let prev = atom.fetch_xor(0xFF, Ordering::Relaxed); // bitwise XOR (toggle)
// ── Compare-and-Exchange (CAS) — the fundamental primitive ──
let result = atom.compare_exchange(
40, // expected current value
99, // new value (only if current == expected)
Ordering::AcqRel, // ordering on success
Ordering::Acquire, // ordering on failure
);
// result: Ok(40) if swapped, Err(actual_value) if not
// compare_exchange_weak — can spuriously fail (faster in loops on ARM)
let result = atom.compare_exchange_weak(
99, 100, Ordering::AcqRel, Ordering::Relaxed
);
// ── fetch_update — CAS loop built-in (stable since 1.45) ──
atom.fetch_update(Ordering::AcqRel, Ordering::Acquire, |val| {
if val < 200 { Some(val * 2) } else { None } // None = abort
});
AtomicPtr — Lock-Free Pointer Swapping
use std::sync::atomic::{AtomicPtr, Ordering};
// ── Use Case: Hot-swappable config without locks ──
struct Config { max_connections: usize, timeout_ms: u64 }
static CURRENT_CONFIG: AtomicPtr<Config> = AtomicPtr::new(std::ptr::null_mut());
fn init_config() {
let cfg = Box::new(Config { max_connections: 100, timeout_ms: 5000 });
CURRENT_CONFIG.store(Box::into_raw(cfg), Ordering::Release);
}
fn read_config() -> &'static Config {
let ptr = CURRENT_CONFIG.load(Ordering::Acquire);
unsafe { &*ptr } // safe as long as we never free old configs
}
fn update_config(new_cfg: Config) {
let new_ptr = Box::into_raw(Box::new(new_cfg));
let old_ptr = CURRENT_CONFIG.swap(new_ptr, Ordering::AcqRel);
// ⚠️ old_ptr is leaked here! In production, use epoch-based
// reclamation (crossbeam-epoch) to safely free old configs
// after all readers are done.
}
// Real-world: RCU (Read-Copy-Update) pattern, lock-free linked lists,
// hazard pointers (crossbeam, flurry crate)
Atomic Fences
A fence (memory barrier) enforces ordering constraints without being tied to a specific atomic variable. It's a heavier tool — prefer per-variable orderings when possible.
use std::sync::atomic::{fence, Ordering};
// fence(Ordering::Acquire) — all subsequent reads/writes in this thread
// are guaranteed to see all writes that happened before a matching Release.
// fence(Ordering::Release) — all preceding reads/writes in this thread
// are visible to any thread that does a matching Acquire.
// Typical use: batch multiple Relaxed operations, then fence once
data1.store(val1, Ordering::Relaxed);
data2.store(val2, Ordering::Relaxed);
data3.store(val3, Ordering::Relaxed);
fence(Ordering::Release); // one fence for all three stores
ready.store(true, Ordering::Relaxed);
// Reader side:
if ready.load(Ordering::Relaxed) {
fence(Ordering::Acquire); // pairs with the Release fence above
// now guaranteed to see data1, data2, data3
}
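To see the pairing end-to-end, here is the snippet above as a complete program: three Relaxed stores are published behind one Release fence, and the reader's Acquire fence (after observing the flag) makes all three visible:

```rust
use std::sync::Arc;
use std::sync::atomic::{fence, AtomicBool, AtomicU64, Ordering};
use std::thread;

fn publish_and_read() -> (u64, u64, u64) {
    let data: Arc<[AtomicU64; 3]> =
        Arc::new([AtomicU64::new(0), AtomicU64::new(0), AtomicU64::new(0)]);
    let ready = Arc::new(AtomicBool::new(false));

    let (d, r) = (Arc::clone(&data), Arc::clone(&ready));
    let writer = thread::spawn(move || {
        d[0].store(1, Ordering::Relaxed);
        d[1].store(2, Ordering::Relaxed);
        d[2].store(3, Ordering::Relaxed);
        fence(Ordering::Release); // one fence covers all three stores
        r.store(true, Ordering::Relaxed);
    });

    // Spin until the flag flips, then pair with the Release fence.
    while !ready.load(Ordering::Relaxed) {
        std::hint::spin_loop();
    }
    fence(Ordering::Acquire); // now all three data stores are visible
    writer.join().unwrap();
    (
        data[0].load(Ordering::Relaxed),
        data[1].load(Ordering::Relaxed),
        data[2].load(Ordering::Relaxed),
    )
}

fn main() {
    assert_eq!(publish_and_read(), (1, 2, 3));
    println!("fence pairing ok");
}
```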
Real-World Production Patterns
Pattern 1: Lock-Free Statistics Counter
use std::sync::atomic::{AtomicU64, Ordering};
struct Metrics {
requests_total: AtomicU64,
errors_total: AtomicU64,
bytes_processed: AtomicU64,
active_connections: AtomicU64, // can go up AND down
}
impl Metrics {
const fn new() -> Self {
Metrics {
requests_total: AtomicU64::new(0),
errors_total: AtomicU64::new(0),
bytes_processed: AtomicU64::new(0),
active_connections: AtomicU64::new(0),
}
}
fn record_request(&self, bytes: u64) {
self.requests_total.fetch_add(1, Ordering::Relaxed);
self.bytes_processed.fetch_add(bytes, Ordering::Relaxed);
}
fn record_error(&self) {
self.errors_total.fetch_add(1, Ordering::Relaxed);
}
fn conn_open(&self) {
self.active_connections.fetch_add(1, Ordering::Relaxed);
}
fn conn_close(&self) {
self.active_connections.fetch_sub(1, Ordering::Relaxed);
}
fn snapshot(&self) -> (u64, u64, u64, u64) {
// Relaxed is fine — we just want approximate stats
(
self.requests_total.load(Ordering::Relaxed),
self.errors_total.load(Ordering::Relaxed),
self.bytes_processed.load(Ordering::Relaxed),
self.active_connections.load(Ordering::Relaxed),
)
}
}
static METRICS: Metrics = Metrics::new(); // global, zero-cost init
// Real-world: prometheus metrics, tracing counters, health endpoints
Pattern 2: Graceful Shutdown Signal
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
struct ShutdownSignal {
flag: AtomicBool,
}
impl ShutdownSignal {
fn new() -> Arc<Self> {
Arc::new(ShutdownSignal { flag: AtomicBool::new(false) })
}
fn shutdown(&self) {
self.flag.store(true, Ordering::Release); // signal all workers
}
fn is_shutdown(&self) -> bool {
self.flag.load(Ordering::Acquire) // check signal
}
}
// Worker loop:
let signal = ShutdownSignal::new();
let s = Arc::clone(&signal);
std::thread::spawn(move || {
while !s.is_shutdown() {
// ... process next task ...
}
// cleanup and exit gracefully
});
// Real-world: tokio CancellationToken, actix system signals, worker pools
Pattern 3: Lock-Free ID Generator
use std::sync::atomic::{AtomicU64, Ordering};
struct IdGenerator {
next: AtomicU64,
}
impl IdGenerator {
const fn new() -> Self {
IdGenerator { next: AtomicU64::new(1) }
}
fn next_id(&self) -> u64 {
self.next.fetch_add(1, Ordering::Relaxed)
}
}
static IDS: IdGenerator = IdGenerator::new();
// IDS.next_id() — always unique, never blocks, no Mutex
// Real-world: request IDs, span IDs in tracing, database sequences
Pattern 4: Atomic Bitflags (State Machine)
use std::sync::atomic::{AtomicU8, Ordering};
const INITIALIZED: u8 = 0b0001;
const RUNNING: u8 = 0b0010;
const SHUTTING_DOWN: u8 = 0b0100;
const ERROR: u8 = 0b1000;
struct ServiceState {
flags: AtomicU8,
}
impl ServiceState {
fn set_running(&self) {
self.flags.fetch_or(RUNNING, Ordering::Release);
}
fn set_error(&self) {
self.flags.fetch_or(ERROR, Ordering::Release);
}
fn clear_error(&self) {
self.flags.fetch_and(!ERROR, Ordering::Release); // clear bit
}
fn is_running(&self) -> bool {
self.flags.load(Ordering::Acquire) & RUNNING != 0
}
fn has_error(&self) -> bool {
self.flags.load(Ordering::Acquire) & ERROR != 0
}
}
// Pack multiple boolean states into a single byte, all thread-safe
// Real-world: connection state machines, feature flags, health status
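A compact runnable check of the bit operations above (consts and struct repeated, with a `new` constructor added, so the snippet stands alone):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

const RUNNING: u8 = 0b0010;
const ERROR: u8 = 0b1000;

struct ServiceState {
    flags: AtomicU8,
}

impl ServiceState {
    const fn new() -> Self {
        ServiceState { flags: AtomicU8::new(0) }
    }
    fn set_running(&self) { self.flags.fetch_or(RUNNING, Ordering::Release); }
    fn set_error(&self) { self.flags.fetch_or(ERROR, Ordering::Release); }
    fn clear_error(&self) { self.flags.fetch_and(!ERROR, Ordering::Release); }
    fn is_running(&self) -> bool { self.flags.load(Ordering::Acquire) & RUNNING != 0 }
    fn has_error(&self) -> bool { self.flags.load(Ordering::Acquire) & ERROR != 0 }
}

fn main() {
    let state = ServiceState::new();
    state.set_running();
    state.set_error();
    assert!(state.is_running() && state.has_error());
    state.clear_error(); // clears ONLY the ERROR bit
    assert!(state.is_running() && !state.has_error());
    println!("bitflags ok");
}
```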
Pattern 5: Lock-Free Bounded Counter (with Saturation)
use std::sync::atomic::{AtomicUsize, Ordering};
struct Semaphore {
permits: AtomicUsize,
}
impl Semaphore {
fn new(max: usize) -> Self {
Semaphore { permits: AtomicUsize::new(max) }
}
fn try_acquire(&self) -> bool {
// CAS loop: decrement only if > 0
self.permits.fetch_update(
Ordering::AcqRel,
Ordering::Acquire,
|current| {
if current > 0 { Some(current - 1) } else { None }
}
).is_ok()
}
fn release(&self) {
self.permits.fetch_add(1, Ordering::Release);
}
}
// Real-world: connection pool limits, rate limiters, work-stealing queues
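Exercising the semaphore above as a complete program (redefined here, with `checked_sub` standing in for the explicit `if`, so the snippet compiles on its own):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct Semaphore {
    permits: AtomicUsize,
}

impl Semaphore {
    fn new(max: usize) -> Self {
        Semaphore { permits: AtomicUsize::new(max) }
    }
    fn try_acquire(&self) -> bool {
        // fetch_update runs a CAS loop: decrement only while permits > 0
        // (checked_sub returns None at zero, which aborts the update).
        self.permits
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .is_ok()
    }
    fn release(&self) {
        self.permits.fetch_add(1, Ordering::Release);
    }
}

fn main() {
    let sem = Semaphore::new(2);
    assert!(sem.try_acquire());
    assert!(sem.try_acquire());
    assert!(!sem.try_acquire()); // exhausted: the CAS closure returned None
    sem.release();
    assert!(sem.try_acquire()); // a permit came back
    println!("semaphore ok");
}
```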
Atomics vs Mutex — When to Use Which
| Criteria | Use Atomics | Use Mutex |
|---|---|---|
| Data type | Single integer, bool, or pointer | Complex structs, strings, collections |
| Operation | One read-modify-write per step | Multiple fields updated together (need transaction) |
| Contention | Low-medium (CAS loops waste CPU under high contention) | High (threads sleep while waiting → no CPU waste) |
| Latency | Nanoseconds (no syscall) | Microseconds (OS kernel involved on contention) |
| Composability | Poor — can't atomically update two atomics together | Good — lock protects arbitrary critical section |
| Deadlock risk | None (no locks) | Yes (if multiple mutexes acquired in wrong order) |
| Starvation | Possible (CAS loop may retry indefinitely under extreme contention) | OS scheduler provides fairness |
// ✅ Good use of atomics: simple counter
static COUNTER: AtomicU64 = AtomicU64::new(0);
// ❌ Bad use of atomics: need to update two values together
// These two updates are NOT atomic as a pair:
// balance.fetch_sub(amount); // withdraw
// other_balance.fetch_add(amount); // deposit
// → Another thread could see money "disappear" between the two ops!
// Solution: use Mutex to protect the entire transfer:
// let mut accounts = lock.lock();
// accounts.from -= amount;
// accounts.to += amount;
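The Mutex solution sketched in the comments above, as a complete program: the whole transfer happens under one lock, so no thread can observe the intermediate state where money has "disappeared":

```rust
use std::sync::Mutex;

struct Accounts {
    from: i64,
    to: i64,
}

// Both fields update inside a single critical section.
fn transfer(lock: &Mutex<Accounts>, amount: i64) {
    let mut accounts = lock.lock().unwrap();
    accounts.from -= amount;
    accounts.to += amount;
} // lock released here (RAII guard drops)

fn main() {
    let lock = Mutex::new(Accounts { from: 100, to: 0 });
    transfer(&lock, 30);
    let a = lock.lock().unwrap();
    assert_eq!((a.from, a.to), (70, 30));
    assert_eq!(a.from + a.to, 100); // invariant: total is conserved
    println!("transfer ok");
}
```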
Packed Atomics — Multiple Values in One Word
A powerful technique: pack multiple small values into one AtomicU64, then use CAS to update them atomically together.
use std::sync::atomic::{AtomicU64, Ordering};
// Pack a (generation: u32, index: u32) pair into one AtomicU64
struct VersionedIndex {
packed: AtomicU64,
}
impl VersionedIndex {
fn pack(gen: u32, idx: u32) -> u64 {
((gen as u64) << 32) | (idx as u64)
}
fn unpack(val: u64) -> (u32, u32) {
((val >> 32) as u32, val as u32)
}
// Atomically update index AND bump generation
fn update_index(&self, new_idx: u32) {
self.packed.fetch_update(Ordering::AcqRel, Ordering::Acquire, |old| {
let (gen, _) = Self::unpack(old);
Some(Self::pack(gen + 1, new_idx))
}).ok();
}
}
// Both gen and idx update atomically — no ABA problem!
// Real-world: lock-free queues, epoch-based reclamation, ABA prevention
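A quick round-trip check of the packing scheme above (free functions instead of methods, for brevity):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

fn pack(generation: u32, idx: u32) -> u64 {
    ((generation as u64) << 32) | (idx as u64)
}

fn unpack(val: u64) -> (u32, u32) {
    ((val >> 32) as u32, val as u32)
}

fn main() {
    // Round-trip: both halves survive packing.
    assert_eq!(unpack(pack(7, 42)), (7, 42));

    // One fetch_update bumps the generation AND swaps the index atomically.
    let packed = AtomicU64::new(pack(7, 42));
    packed
        .fetch_update(Ordering::AcqRel, Ordering::Acquire, |old| {
            let (generation, _) = unpack(old);
            Some(pack(generation + 1, 99))
        })
        .unwrap();
    assert_eq!(unpack(packed.load(Ordering::Acquire)), (8, 99));
    println!("packed update ok");
}
```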
Testing Atomics with Loom
loom is a testing tool that explores all possible thread interleavings to find concurrency bugs. It replaces std::sync::atomic with its own types that systematically test every ordering.
// In Cargo.toml:
// [dev-dependencies]
// loom = "0.7"
#[cfg(not(loom))]
use std::sync::atomic::{AtomicUsize, Ordering};
#[cfg(loom)]
use loom::sync::atomic::{AtomicUsize, Ordering};
#[test]
#[cfg(loom)]
fn test_concurrent_increment() {
loom::model(|| {
let counter = loom::sync::Arc::new(AtomicUsize::new(0));
let c1 = counter.clone();
let c2 = counter.clone();
let t1 = loom::thread::spawn(move || c1.fetch_add(1, Ordering::Relaxed));
let t2 = loom::thread::spawn(move || c2.fetch_add(1, Ordering::Relaxed));
t1.join().unwrap();
t2.join().unwrap();
assert_eq!(counter.load(Ordering::Relaxed), 2);
});
}
// loom exhaustively tests EVERY possible interleaving
// Real-world: crossbeam, tokio, parking_lot all use loom for testing
Platform Considerations
| Architecture | Behaviour |
|---|---|
| x86/x86_64 | Strong memory model — most orderings are "free." Acquire/Release compile to plain loads/stores. Only SeqCst adds an MFENCE. CAS uses LOCK CMPXCHG. |
| ARM / Apple Silicon | Weak memory model — Acquire/Release generate real barrier instructions (DMB). compare_exchange_weak can spuriously fail (LL/SC pair). Performance difference between orderings is real. |
| RISC-V | Weak model similar to ARM. Explicit fence instructions for barriers. Growing target for embedded Rust. |
| WebAssembly | Atomics require SharedArrayBuffer. Atomics.wait/notify for blocking. Ordering maps to wasm fence instructions. |
Rule of thumb: use Relaxed for independent counters/stats, Acquire/Release pairs when one thread publishes data for another to read, and SeqCst only when you need a global total order. For complex state, use a Mutex instead; you can't atomically update two separate atomics together. Test concurrent code with loom to check all possible interleavings.
78. Multi-Paradigm Rust Core
The Three Paradigms in Rust
What Rust Takes from Functional Programming
Rust borrows heavily from the ML family of languages (OCaml, Haskell, F#). Many of Rust's most distinctive features are functional in origin:
// ── 1. Immutable by default ──
let x = 5; // immutable — functional default
let mut y = 5; // must explicitly opt into mutability
// ── 2. Expression-based — everything returns a value ──
let result = if condition { 1 } else { 2 }; // if is an expression
let val = match x { // match is an expression
0 => "zero",
_ => "other",
};
let block_val = { // blocks are expressions
let a = 10;
let b = 20;
a + b // no semicolon = return value
};
// ── 3. Algebraic Data Types (ADTs) ──
// Product types (structs) × Sum types (enums)
struct Point { x: f64, y: f64 } // product type (AND)
enum Shape { // sum type (OR)
Circle(f64),
Rect(f64, f64),
Triangle { a: f64, b: f64, c: f64 },
}
// ── 4. Pattern matching with exhaustiveness checking ──
fn area(s: &Shape) -> f64 {
match s {
Shape::Circle(r) => std::f64::consts::PI * r * r,
Shape::Rect(w, h) => w * h,
Shape::Triangle { a, b, c } => {
let s = (a + b + c) / 2.0;
(s * (s - a) * (s - b) * (s - c)).sqrt()
}
} // compiler FORCES you to handle all variants
}
// ── 5. Closures & higher-order functions ──
let add = |a, b| a + b; // closure
let apply = |f: &dyn Fn(i32, i32) -> i32, x: i32, y: i32| f(x, y); // HOF
// ── 6. Iterator chains (lazy, composable, zero-cost) ──
let sum_of_squares: i32 = (1..=10)
.filter(|x| x % 2 == 0) // keep evens
.map(|x| x * x) // square
.sum(); // fold
// No intermediate allocations — this compiles to a simple loop!
// ── 7. Option & Result — monadic error handling ──
fn parse_and_double(s: &str) -> Option<i32> {
s.parse::<i32>().ok() // Result → Option
.filter(|&n| n > 0) // chain with filter
.map(|n| n * 2) // chain with map
}
// No null, no exceptions — just types
// ── 8. No null — Option<T> instead ──
let maybe: Option<i32> = Some(42);
let nothing: Option<i32> = None;
// Must explicitly handle the None case — can't accidentally dereference null
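The Option chain from point 7 can be verified directly; here it is as a self-contained program:

```rust
// Same combinator chain as above: parse, reject non-positives, double.
fn parse_and_double(s: &str) -> Option<i32> {
    s.parse::<i32>().ok() // Result → Option
        .filter(|&n| n > 0) // keep only positive values
        .map(|n| n * 2) // transform inside the Option
}

fn main() {
    assert_eq!(parse_and_double("21"), Some(42)); // happy path
    assert_eq!(parse_and_double("-3"), None); // filtered out
    assert_eq!(parse_and_double("abc"), None); // parse failure becomes None
    println!("option chain ok");
}
```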
What Rust Takes from Object-Oriented Programming
Rust supports encapsulation, polymorphism, and method dispatch — but rejects classical inheritance. It uses composition and traits instead of class hierarchies.
// ── 1. Structs + impl blocks = methods (like classes without inheritance) ──
struct BankAccount {
owner: String, // fields are private by default
balance: f64,
}
impl BankAccount {
// Associated function (like a static method / constructor)
fn new(owner: String, initial: f64) -> Self {
BankAccount { owner, balance: initial }
}
// Method (takes &self or &mut self)
fn deposit(&mut self, amount: f64) { self.balance += amount; }
fn balance(&self) -> f64 { self.balance } // getter
}
// ── 2. Traits = interfaces (with default implementations) ──
trait Drawable {
fn draw(&self);
fn bounding_box(&self) -> (f64, f64, f64, f64);
fn description(&self) -> String { // default impl
format!("Drawable at {:?}", self.bounding_box())
}
}
// ── 3. Polymorphism — BOTH static and dynamic ──
// Static dispatch (monomorphized, zero-cost — like C++ templates)
fn render_static(item: &impl Drawable) {
item.draw(); // compiler knows the concrete type → inlines
}
// Dynamic dispatch (vtable, runtime — like Java/C# interfaces)
fn render_dynamic(item: &dyn Drawable) {
item.draw(); // vtable lookup at runtime
}
// Heterogeneous collection (only possible with dynamic dispatch)
let shapes: Vec<Box<dyn Drawable>> = vec![
Box::new(circle),
Box::new(rectangle),
Box::new(triangle),
];
// ── 4. Encapsulation — pub visibility controls ──
mod engine {
pub struct Database {
pub name: String, // public field
connection_string: String, // private! (default)
}
impl Database {
pub fn connect(&self) { /* ... */ } // public method
fn internal(&self) { /* ... */ } // private method
}
}
// ── 5. Composition over inheritance ──
struct Car {
engine: Engine, // HAS-A engine (composition)
wheels: [Wheel; 4], // HAS wheels
}
// Rust has NO "class Car extends Vehicle" — you compose structs
// and implement traits to get polymorphism
What Rust Does NOT Have (and Why)
| Feature | Present in | Rust's Alternative | Why |
|---|---|---|---|
| Class inheritance | Java, C++, Python | Traits + composition | Inheritance creates tight coupling and the fragile base class problem. Traits give you polymorphism without the downsides. |
| Null / nil | Java, C, Go, JS | Option<T> | Null references are "the billion-dollar mistake." Option forces explicit handling at compile time. |
| Exceptions | Java, Python, C++ | Result<T, E> + ? | Exceptions have invisible control flow. Result makes errors visible in the type signature. |
| Garbage collector | Java, Go, JS, Python | Ownership + RAII | GC adds latency and unpredictability. Ownership gives deterministic deallocation with zero runtime cost. |
| Implicit conversions | C++, JS, Scala | Explicit From/Into | Implicit conversions cause subtle bugs. Rust makes all conversions explicit and visible. |
| Higher-kinded types | Haskell, Scala | GATs (partial), workarounds | Full HKTs add enormous complexity. GATs cover most practical use cases. |
| Tail call optimisation | Haskell, Scheme, Erlang | Iterators, loops | Not guaranteed by LLVM. Use iterators for the same pattern without stack overflow risk. |
| Runtime reflection | Java, C#, Python | Macros + Any (limited) | Reflection requires runtime metadata → binary bloat. Macros handle most metaprogramming at compile time. |
Idiomatic Rust: Blending Paradigms
Real Rust code freely mixes paradigms — even in a single function. The key insight is: Rust gives you functional idioms for data transformation, OOP idioms for API design, and systems idioms for performance-critical code.
// A real-world function mixing all three paradigms:
struct UserService { // OOP: struct with methods
db: Database,
cache: Arc<RwLock<HashMap<u64, User>>>, // Systems: explicit memory model
}
impl UserService {
// OOP: method on struct
fn active_premium_emails(&self) -> Result<Vec<String>, DbError> {
let users = self.db.all_users()?; // FP: Result + ? operator
let emails = users.into_iter() // FP: iterator chain
.filter(|u| u.is_active) // FP: closure
.filter(|u| u.subscription == Tier::Premium)
.map(|u| u.email) // FP: transform
.collect(); // FP: collect into Vec
Ok(emails) // FP: wrap in Result
}
}
// The trait (OOP interface) that makes this testable:
trait UserRepository {
fn all_users(&self) -> Result<Vec<User>, DbError>;
}
// In tests: impl UserRepository for MockDb { ... }
// In prod: impl UserRepository for PostgresDb { ... }
Paradigm Comparison: Same Problem, Three Styles
// Task: find the longest name from a list of users who are active
// ── Imperative / Systems style ──
fn longest_active_imperative(users: &[User]) -> Option<&str> {
let mut longest: Option<&str> = None;
for user in users {
if user.is_active {
match longest {
None => longest = Some(&user.name),
Some(prev) if user.name.len() > prev.len() => {
longest = Some(&user.name);
}
_ => {}
}
}
}
longest
}
// ── Functional / Iterator style ──
fn longest_active_functional(users: &[User]) -> Option<&str> {
users.iter()
.filter(|u| u.is_active)
.map(|u| u.name.as_str())
.max_by_key(|name| name.len())
}
// Both compile to nearly identical machine code!
// The functional version is more concise and harder to get wrong.
// The imperative version gives finer control for complex logic.
// Rust lets you choose the right style for each situation.
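A self-contained harness showing both versions agree, with a minimal User type (which the snippets above assume) and the imperative version condensed slightly via map_or:

```rust
// Minimal stand-in for the User type the snippets assume.
struct User {
    name: String,
    is_active: bool,
}

fn longest_active_imperative(users: &[User]) -> Option<&str> {
    let mut longest: Option<&str> = None;
    for user in users {
        if user.is_active
            && longest.map_or(true, |prev| user.name.len() > prev.len())
        {
            longest = Some(&user.name);
        }
    }
    longest
}

fn longest_active_functional(users: &[User]) -> Option<&str> {
    users.iter()
        .filter(|u| u.is_active)
        .map(|u| u.name.as_str())
        .max_by_key(|name| name.len())
}

fn main() {
    let users = vec![
        User { name: "Al".into(), is_active: true },
        User { name: "Beatrice".into(), is_active: true },
        User { name: "Christopher".into(), is_active: false }, // longest, but inactive
    ];
    // Both styles must produce the same answer.
    assert_eq!(
        longest_active_imperative(&users),
        longest_active_functional(&users)
    );
    assert_eq!(longest_active_functional(&users), Some("Beatrice"));
    println!("styles agree");
}
```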
Traits vs Inheritance: Why Rust Made This Choice
// ── What inheritance gives you in Java: ──
// class Animal { void eat() {...} }
// class Dog extends Animal { void bark() {...} }
// class GuideDog extends Dog { void guide() {...} }
// Problem: "diamond of death", fragile base class, tight coupling
// ── What Rust gives you instead: traits (composable behaviors) ──
trait Eater { fn eat(&self); }
trait Barker { fn bark(&self); }
trait Guide { fn guide(&self, dest: &str); }
trait Swimmable { fn swim(&self); }
struct GuideDog { name: String }
// Implement exactly the traits you need — no forced hierarchy
impl Eater for GuideDog { fn eat(&self) { /* ... */ } }
impl Barker for GuideDog { fn bark(&self) { /* ... */ } }
impl Guide for GuideDog { fn guide(&self, dest: &str) { /* ... */ } }
// GuideDog is NOT Swimmable — and that's fine!
// Trait bounds = "this function works with anything that can eat AND bark"
fn feed_and_play(pet: &(impl Eater + Barker)) {
pet.eat();
pet.bark();
}
// Supertrait = trait that requires another trait
trait ServiceAnimal: Eater + Guide {
fn certification_id(&self) -> &str;
}
// Anyone implementing ServiceAnimal MUST also implement Eater + Guide
When to Use Which Style
| Situation | Best Paradigm | Why |
|---|---|---|
| Data transformation pipeline | Functional (iterators + closures) | Composable, no mutable state, zero-cost |
| API design / public interfaces | OOP (traits + structs + visibility) | Encapsulation, polymorphism, testability |
| Performance-critical hot loop | Systems (manual control, unsafe if needed) | Predictable layout, SIMD, cache-friendly |
| Error handling | Functional (Result + ? + combinators) | Composable, type-safe, no hidden control flow |
| State machines / protocols | OOP (typestate pattern with enums) | Compiler enforces valid transitions |
| FFI / interop with C | Systems (repr(C), unsafe, raw pointers) | Direct memory layout control required |
| Configuration / builders | OOP (builder pattern with method chaining) | Fluent API, optional fields, validation |
| Concurrent data sharing | Systems (Arc, Mutex, atomics) | Explicit ownership and synchronisation model |
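The typestate pattern mentioned for state machines, in a minimal sketch (names here are illustrative): each state is a zero-sized type parameter, and invalid transitions simply don't exist as methods, so the compiler rejects them.

```rust
use std::marker::PhantomData;

struct Idle;
struct Running;

struct Machine<State> {
    ticks: u32,
    _state: PhantomData<State>,
}

impl Machine<Idle> {
    fn new() -> Self {
        Machine { ticks: 0, _state: PhantomData }
    }
    // Consumes the Idle machine and returns a Running one.
    fn start(self) -> Machine<Running> {
        Machine { ticks: self.ticks, _state: PhantomData }
    }
}

impl Machine<Running> {
    fn tick(mut self) -> Self {
        self.ticks += 1;
        self
    }
    fn stop(self) -> Machine<Idle> {
        Machine { ticks: self.ticks, _state: PhantomData }
    }
}

fn main() {
    let m = Machine::new().start().tick().tick().stop();
    assert_eq!(m.ticks, 2);
    // Machine::new().tick(); // compile error: no `tick` on Machine<Idle>
    // Machine::new().stop(); // compile error: can't stop what isn't running
    println!("typestate ok");
}
```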
79. Cryptography in Rust Advanced
Rust's type system and memory safety make it an excellent choice for cryptographic implementations — no buffer overflows, no use-after-free, and deterministic resource cleanup. The Rust crypto ecosystem is mature, audited, and widely used in production systems.
79.1 Ecosystem Overview
| Category | Crate | Purpose | Notes |
|---|---|---|---|
| Hashing | sha2, sha3, blake3 | Cryptographic hash functions | RustCrypto project |
| Symmetric Encryption | aes-gcm, chacha20poly1305 | Authenticated encryption (AEAD) | Preferred over raw block ciphers |
| Asymmetric Encryption | rsa, ed25519-dalek, x25519-dalek | Public-key crypto & key exchange | Ed25519 preferred for signatures |
| Password Hashing | argon2, bcrypt, scrypt | Key derivation / password storage | Argon2id recommended |
| TLS | rustls, native-tls | Transport layer security | rustls is pure Rust, no OpenSSL |
| Random | rand, getrandom | CSPRNG & OS entropy | Always use OsRng for crypto |
| ECDSA/secp256k1 | k256, p256 | Elliptic curve operations | See also section 59 |
| Certificates | rcgen, x509-parser | X.509 cert generation & parsing | Useful for mTLS |
| Multi-purpose | ring | Fast, audited core crypto | BoringSSL-derived, limited API |
79.2 Cryptographic Hashing
Hash functions produce a fixed-size digest from arbitrary input. Use SHA-256 for general purposes, BLAKE3 for performance, SHA-3 for post-quantum hedging.
use sha2::{Sha256, Digest};
use blake3;
// SHA-256 — one-shot
let hash = Sha256::digest(b"hello world");
println!("SHA-256: {:x}", hash);
// SHA-256 — streaming (for large data)
let mut hasher = Sha256::new();
hasher.update(b"hello ");
hasher.update(b"world");
let result = hasher.finalize();
// BLAKE3 — much faster, parallel-friendly
let hash = blake3::hash(b"hello world");
println!("BLAKE3: {}", hash.to_hex());
// BLAKE3 — keyed MAC (message authentication)
let key = [0u8; 32]; // Use a real key!
let mac = blake3::keyed_hash(&key, b"message");
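The streaming API above mirrors std's own Hasher design, and the key property is the same: feeding input in chunks yields the same digest as one-shot hashing, which is what makes incremental hashing of large files safe. A std-only illustration (DefaultHasher is not cryptographic; this is an API analogy only):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

fn main() {
    // Chunked: like hasher.update(b"hello "); hasher.update(b"world")
    let mut chunked = DefaultHasher::new();
    chunked.write(b"hello ");
    chunked.write(b"world");

    // One-shot: like Sha256::digest(b"hello world")
    let mut oneshot = DefaultHasher::new();
    oneshot.write(b"hello world");

    // Splitting the input does not change the digest
    assert_eq!(chunked.finish(), oneshot.finish());
}
```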
79.3 Symmetric Encryption (AEAD)
Authenticated Encryption with Associated Data (AEAD) provides confidentiality + integrity + authenticity in one operation. Always prefer AEAD over raw ciphers.
use aes_gcm::{Aes256Gcm, Key, Nonce};
use aes_gcm::aead::{Aead, KeyInit, OsRng};
// AES-256-GCM — industry standard AEAD
let key = Aes256Gcm::generate_key(OsRng);
let cipher = Aes256Gcm::new(&key);
let nonce = Nonce::from_slice(b"unique nonce"); // 96-bit, MUST be unique per message
// Encrypt
let ciphertext = cipher.encrypt(nonce, b"secret data".as_ref())
.expect("encryption failed");
// Decrypt
let plaintext = cipher.decrypt(nonce, ciphertext.as_ref())
.expect("decryption failed — tampered?");
use chacha20poly1305::{ChaCha20Poly1305, Key, Nonce};
use chacha20poly1305::aead::{Aead, KeyInit, OsRng};
// ChaCha20-Poly1305 — faster on CPUs without AES-NI
let key = ChaCha20Poly1305::generate_key(OsRng);
let cipher = ChaCha20Poly1305::new(&key);
let nonce = Nonce::from_slice(b"unique nonce"); // 96-bit (12 bytes)
let ciphertext = cipher.encrypt(nonce, b"secret".as_ref())?;
let plaintext = cipher.decrypt(nonce, ciphertext.as_ref())?;
| Algorithm | Key Size | Nonce Size | Best For |
|---|---|---|---|
| AES-256-GCM | 256-bit | 96-bit | Hardware-accelerated (AES-NI), standard compliance |
| ChaCha20-Poly1305 | 256-bit | 96-bit | Software performance, embedded, mobile |
| XChaCha20-Poly1305 | 256-bit | 192-bit | Random nonces safe (larger nonce space) |
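Since a single nonce reuse under one key is catastrophic for both GCM and Poly1305, many systems derive nonces from a counter instead of randomness. A minimal std-only sketch of that idea (the layout and names here are illustrative assumptions, not an API from any crate):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Illustrative nonce source: a process-wide counter mapped into a
// 96-bit (12-byte) nonce. Every call yields a distinct value, so a
// single key never sees the same nonce twice within this process.
static COUNTER: AtomicU64 = AtomicU64::new(0);

fn next_nonce() -> [u8; 12] {
    let n = COUNTER.fetch_add(1, Ordering::Relaxed);
    let mut nonce = [0u8; 12];
    // Low 8 bytes carry the counter; the high 4 bytes stay zero
    // (or could hold a per-key invocation field).
    nonce[4..].copy_from_slice(&n.to_be_bytes());
    nonce
}

fn main() {
    let a = next_nonce();
    let b = next_nonce();
    assert_ne!(a, b); // distinct on every call
}
```

When nonces must be random (e.g., stateless services), XChaCha20-Poly1305's 192-bit nonce makes collisions negligible, as the table notes.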
79.4 Password Hashing & Key Derivation
Never store passwords as plaintext or simple hashes. Use a memory-hard KDF. Argon2id is the current best practice (winner of the Password Hashing Competition).
use argon2::{Argon2, PasswordHasher, PasswordVerifier};
use argon2::password_hash::{SaltString, rand_core::OsRng};
// Hash a password
let salt = SaltString::generate(&mut OsRng);
let argon2 = Argon2::default();
let hash = argon2.hash_password(b"my-password", &salt)
.expect("hashing failed")
.to_string(); // PHC string format: $argon2id$v=19$m=...
// Verify a password
use argon2::password_hash::PasswordHash;
let parsed = PasswordHash::new(&hash).unwrap();
assert!(argon2.verify_password(b"my-password", &parsed).is_ok());
// Key derivation from password (for encryption keys)
let mut derived_key = [0u8; 32];
argon2.hash_password_into(b"password", salt.as_str().as_bytes(), &mut derived_key)
.expect("KDF failed");
| Algorithm | Recommended? | Notes |
|---|---|---|
| Argon2id | ✅ Yes — best choice | Memory-hard, GPU-resistant, PHC winner |
| bcrypt | ⚠️ Acceptable | Legacy, 72-byte password limit |
| scrypt | ⚠️ Acceptable | Memory-hard, harder to tune correctly |
| PBKDF2 | ❌ Avoid | Not memory-hard, GPU-vulnerable |
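To see why the salt matters, here is a deliberately non-cryptographic toy using std's DefaultHasher (SipHash); it is emphatically not a password hash, but it shows how per-user salts make identical passwords store differently:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy illustration ONLY: DefaultHasher is NOT a password hash. The
// point is the salting pattern: the same password yields different
// stored digests under different salts, which defeats precomputed
// (rainbow-table) attacks across users.
fn toy_hash(password: &str, salt: u64) -> u64 {
    let mut h = DefaultHasher::new();
    salt.hash(&mut h);
    password.hash(&mut h);
    h.finish()
}

fn main() {
    let alice = toy_hash("hunter2", 0xA11CE);
    let bob = toy_hash("hunter2", 0xB0B);
    assert_ne!(alice, bob); // same password, different digests
    assert_eq!(alice, toy_hash("hunter2", 0xA11CE)); // verify = re-derive
}
```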
79.5 Asymmetric Cryptography (Public Key)
Used for digital signatures, key exchange, and encryption where parties don't share a secret.
// Ed25519 — fast, secure digital signatures
use ed25519_dalek::{SigningKey, Signature, Signer, Verifier};
use rand::rngs::OsRng;
// Generate keypair
let signing_key = SigningKey::generate(&mut OsRng);
let verifying_key = signing_key.verifying_key();
// Sign a message
let message = b"important document";
let signature: Signature = signing_key.sign(message);
// Verify signature
assert!(verifying_key.verify(message, &signature).is_ok());
// X25519 — Diffie-Hellman key exchange
use x25519_dalek::{EphemeralSecret, PublicKey};
// Alice generates her keypair
let alice_secret = EphemeralSecret::random_from_rng(OsRng);
let alice_public = PublicKey::from(&alice_secret);
// Bob generates his keypair
let bob_secret = EphemeralSecret::random_from_rng(OsRng);
let bob_public = PublicKey::from(&bob_secret);
// Both derive the same shared secret
let alice_shared = alice_secret.diffie_hellman(&bob_public);
let bob_shared = bob_secret.diffie_hellman(&alice_public);
assert_eq!(alice_shared.as_bytes(), bob_shared.as_bytes());
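The algebra that makes both sides agree can be shown with a toy Diffie-Hellman over integers mod p (illustrative only; the small prime and exponents below are made-up demo values, and real code should use X25519 as above):

```rust
// Square-and-multiply modular exponentiation
fn mod_pow(mut base: u128, mut exp: u128, p: u128) -> u128 {
    let mut acc = 1u128;
    base %= p;
    while exp > 0 {
        if exp & 1 == 1 { acc = acc * base % p; }
        base = base * base % p;
        exp >>= 1;
    }
    acc
}

fn main() {
    let p = 0xFFFF_FFFBu128; // small demo prime, far too small for real use
    let g = 5;               // demo generator
    let (a, b) = (123_456_789u128, 987_654_321u128); // private keys
    let (ga, gb) = (mod_pow(g, a, p), mod_pow(g, b, p)); // public keys
    // Each side combines its own private key with the peer's public key:
    let alice_shared = mod_pow(gb, a, p);
    let bob_shared = mod_pow(ga, b, p);
    assert_eq!(alice_shared, bob_shared); // (g^b)^a == (g^a)^b mod p
}
```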
79.6 Secure Random Number Generation
use rand::{Rng, rngs::OsRng};
// ALWAYS use OsRng (or ThreadRng, which reseeds from OsRng) for crypto
let mut rng = OsRng;
// Random bytes
let mut key_bytes = [0u8; 32];
rng.fill(&mut key_bytes);
// Random value in a range
let dice: u8 = rng.gen_range(1..=6);
// Full-width random u64
let token: u64 = rng.gen();
// Generating tokens / session IDs
use base64::{Engine, engine::general_purpose::URL_SAFE_NO_PAD};
let mut token_bytes = [0u8; 32];
OsRng.fill(&mut token_bytes);
let token = URL_SAFE_NO_PAD.encode(token_bytes);
Never use a PRNG seeded with a timestamp (for example StdRng::seed_from_u64) or rand::rngs::SmallRng for cryptographic purposes. Always use OsRng or ThreadRng (which reseeds itself from OS entropy).
79.7 TLS with Rustls
rustls is a pure-Rust TLS implementation — no OpenSSL dependency, memory-safe, and modern (TLS 1.2/1.3 only).
use rustls::{ClientConfig, RootCertStore};
use std::sync::Arc;
// TLS client with system root certificates
let mut root_store = RootCertStore::empty();
root_store.extend(
webpki_roots::TLS_SERVER_ROOTS.iter().cloned()
);
let config = ClientConfig::builder()
.with_root_certificates(root_store)
.with_no_client_auth();
let config = Arc::new(config);
// Use with reqwest (HTTP client)
// reqwest::ClientBuilder::new()
// .use_rustls_tls()
// .build()?
// Use with tokio-rustls for raw TCP+TLS
// let connector = TlsConnector::from(config);
// let stream = connector.connect(domain, tcp_stream).await?;
79.8 HMAC (Hash-based Message Authentication)
use hmac::{Hmac, Mac};
use sha2::Sha256;
type HmacSha256 = Hmac<Sha256>;
// Create HMAC
let mut mac = HmacSha256::new_from_slice(b"secret-key")
.expect("key length ok");
mac.update(b"message to authenticate");
let result = mac.finalize();
let tag = result.into_bytes();
// Verify HMAC (constant-time comparison)
let mut mac = HmacSha256::new_from_slice(b"secret-key").unwrap();
mac.update(b"message to authenticate");
mac.verify_slice(&tag).expect("HMAC valid");
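For intuition, the HMAC construction H((K ⊕ opad) || H((K ⊕ ipad) || m)) can be sketched with std's DefaultHasher standing in for the hash (a toy, not a real MAC; production code should keep using Hmac<Sha256> as above):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Nested-hash HMAC shape: inner hash binds the message to the key,
// outer hash prevents length-extension-style tampering with the tag.
fn toy_hmac(key: &[u8], msg: &[u8]) -> u64 {
    let ipad: Vec<u8> = key.iter().map(|b| b ^ 0x36).collect();
    let opad: Vec<u8> = key.iter().map(|b| b ^ 0x5c).collect();

    let mut inner = DefaultHasher::new();
    inner.write(&ipad);
    inner.write(msg);

    let mut outer = DefaultHasher::new();
    outer.write(&opad);
    outer.write(&inner.finish().to_be_bytes());
    outer.finish()
}

fn main() {
    let tag = toy_hmac(b"secret-key", b"message");
    assert_eq!(tag, toy_hmac(b"secret-key", b"message")); // deterministic
    assert_ne!(tag, toy_hmac(b"wrong-key", b"message"));  // key-dependent
}
```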
79.9 Zeroize — Secure Memory Cleanup
Sensitive data (keys, passwords) should be zeroed from memory when no longer needed. The compiler may optimize away naive zeroing — zeroize prevents this.
use zeroize::{Zeroize, ZeroizeOnDrop};
// Manual zeroize
let mut secret = vec![0x42u8; 32];
// ... use secret ...
secret.zeroize(); // Guaranteed to zero memory
// Auto-zeroize on drop
#[derive(ZeroizeOnDrop)]
struct PrivateKey {
bytes: [u8; 32],
}
// Key is automatically zeroed when dropped
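A rough sketch of what zeroize does under the hood, assuming only std: zeros are written through a volatile pointer so the optimizer cannot treat the buffer as dead and skip the wipe (the real crate also handles Drop integration and non-byte types):

```rust
use std::sync::atomic::{compiler_fence, Ordering};

// Volatile writes cannot be elided even if the buffer is never read
// again; the fence keeps the compiler from reordering around the wipe.
fn wipe(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        unsafe { std::ptr::write_volatile(b, 0) };
    }
    compiler_fence(Ordering::SeqCst);
}

fn main() {
    let mut secret = *b"super secret key";
    wipe(&mut secret);
    assert!(secret.iter().all(|&b| b == 0));
}
```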
79.10 Constant-Time Operations
Timing side-channels leak secrets. Always use constant-time comparison for secrets.
use subtle::ConstantTimeEq;
// ❌ WRONG — timing side-channel
if user_token == stored_token { /* ... */ }
// ✅ CORRECT — constant-time comparison
if user_token.ct_eq(&stored_token).into() { /* ... */ }
// The `subtle` crate provides constant-time:
// - ConstantTimeEq — equality check
// - ConditionallySelectable — select without branching
// - Choice — constant-time boolean
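For intuition, here is the spirit of ConstantTimeEq in plain std: accumulate the XOR of all byte pairs and inspect the result only once, so timing does not reveal the position of the first mismatch (the real crate additionally avoids branching on the final result):

```rust
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // lengths may be public; contents must not leak
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y; // no early exit, unlike `==` on slices
    }
    diff == 0
}

fn main() {
    assert!(ct_eq(b"token-abc", b"token-abc"));
    assert!(!ct_eq(b"token-abc", b"token-abd"));
}
```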
79.11 Common Cargo.toml for Crypto
# Cargo.toml dependencies for a typical crypto project
[dependencies]
sha2 = "0.10" # SHA-256, SHA-512
blake3 = "1" # Fast hashing
aes-gcm = "0.10" # AES-GCM AEAD
chacha20poly1305 = "0.10" # ChaCha20 AEAD
ed25519-dalek = "2" # Ed25519 signatures
x25519-dalek = "2" # X25519 key exchange
argon2 = "0.5" # Password hashing
hmac = "0.12" # HMAC
rand = "0.8" # CSPRNG
zeroize = "1" # Secure memory cleanup
subtle = "2" # Constant-time ops
rustls = "0.23" # Pure-Rust TLS
ring = "0.17" # Core crypto (alternative)
79.12 Security Best Practices
- Always use AEAD — never raw AES-CBC or AES-CTR. AEAD = encryption + authentication.
- Never reuse nonces — use XChaCha20 (192-bit nonce) if you can't guarantee uniqueness.
- Use Argon2id for passwords — at least 64 MB memory, 3 iterations.
- Zeroize secrets — derive ZeroizeOnDrop for all key types.
- Constant-time comparison — use subtle::ConstantTimeEq for any secret comparison.
- Use OsRng — never seed your own PRNG for cryptographic purposes.
- Prefer audited crates — ring, the RustCrypto crates, and the dalek family are well-reviewed.
- Pin dependencies — crypto crate updates can change behavior. Use Cargo.lock.
- Don't roll your own crypto — use established primitives and high-level APIs.
For any comparison of secret values, reach for the subtle crate. Crates like ring and the RustCrypto ecosystem provide audited, misuse-resistant APIs where incorrect usage often fails to compile rather than silently introducing vulnerabilities.
80. Hardware Security Modules (HSMs) Advanced
Hardware Security Modules are dedicated cryptographic processors that generate, store, and manage keys in tamper-resistant hardware. Keys never leave the HSM — all crypto operations happen on-device. Rust can interface with HSMs through PKCS#11, cloud KMS APIs, and vendor SDKs.
80.1 HSM Overview & Why They Matter
| Property | Software Keys | HSM Keys |
|---|---|---|
| Key Storage | In memory / on disk (extractable) | Inside tamper-resistant hardware (non-extractable) |
| Side-Channel Resistance | Vulnerable (timing, power analysis) | Hardened against physical attacks |
| Compliance | Generally insufficient for FIPS/PCI | FIPS 140-2/3 Level 2-4, PCI DSS, eIDAS |
| Key Lifecycle | Manual rotation, backup complexity | Built-in rotation, secure backup, audit logging |
| Performance | CPU-bound | Dedicated crypto accelerators (RSA, ECC, AES) |
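The non-extractable property in the table can be mimicked in plain Rust's type system: keep the key in a private field and export only the operations, so callers hold a handle rather than key bytes. A toy model (the checksum "signature" is a made-up stand-in for an on-device operation):

```rust
mod hsm {
    pub struct KeyHandle {
        secret: [u8; 32], // private field: no getter, cannot be extracted
    }

    impl KeyHandle {
        pub fn generate(seed: u8) -> KeyHandle {
            KeyHandle { secret: [seed; 32] }
        }
        // The only exported operation: data in, signature out.
        pub fn sign(&self, msg: &[u8]) -> u64 {
            msg.iter()
                .chain(&self.secret)
                .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
        }
    }
}

fn main() {
    let key = hsm::KeyHandle::generate(7);
    let sig = key.sign(b"data to sign");
    assert_eq!(sig, key.sign(b"data to sign")); // deterministic
    // key.secret; // ERROR: field `secret` is private — key never leaves
}
```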
80.2 Types of HSMs
| Type | Examples | Use Case | Rust Access |
|---|---|---|---|
| Network HSM | Thales Luna, Entrust nShield | Enterprise, PKI, banking | PKCS#11 over network |
| PCIe HSM | Thales Luna PCIe, Utimaco | High-throughput server-side | PKCS#11 local |
| USB HSM | YubiHSM 2, Nitrokey HSM | Small-scale, dev/test, code signing | Native SDK or PKCS#11 |
| Cloud HSM | AWS CloudHSM, Azure Dedicated HSM | Cloud-native key management | PKCS#11 or cloud SDK |
| Cloud KMS | AWS KMS, GCP KMS, Azure Key Vault | Managed key ops (not full HSM) | REST API / SDK |
| Smart Cards / TPM | TPM 2.0, PIV cards | Device attestation, user auth | tss-esapi, pcsc |
80.3 PKCS#11 — The Standard HSM Interface
PKCS#11 (Cryptoki) is the industry-standard C API for HSMs. Rust interacts via FFI bindings.
// Using the `cryptoki` crate (safe Rust wrapper for PKCS#11)
use cryptoki::{context::{CInitializeArgs, Pkcs11}, session::UserType, types::AuthPin};
use cryptoki::mechanism::Mechanism;
use cryptoki::object::{Attribute, AttributeType};
// Initialize PKCS#11 library (path to vendor .so/.dylib)
let pkcs11 = Pkcs11::new("/usr/lib/softhsm/libsofthsm2.so")?;
pkcs11.initialize(CInitializeArgs::OsThreads)?;
// List available slots (each slot = one HSM partition)
let slots = pkcs11.get_slots_with_token()?;
let slot = slots[0];
// Open session and login
let session = pkcs11.open_rw_session(slot)?;
session.login(UserType::User, Some(&AuthPin::new("1234".into())))?;
// Generate RSA keypair ON the HSM
let (pub_key, priv_key) = session.generate_key_pair(
&Mechanism::RsaPkcsKeyPairGen,
&[
Attribute::ModulusBits(2048.into()),
Attribute::Token(true), // Persist on HSM
Attribute::Verify(true),
],
&[
Attribute::Token(true),
Attribute::Private(true), // Requires login to use
Attribute::Sign(true),
Attribute::Sensitive(true), // Cannot be extracted
],
)?;
// Sign data using the HSM (private key never leaves hardware)
let data = b"data to sign";
let signature = session.sign(&Mechanism::RsaPkcs, priv_key, data)?;
// Verify signature
session.verify(&Mechanism::RsaPkcs, pub_key, data, &signature)?;
// Cleanup
session.logout()?;
80.4 YubiHSM 2 — Affordable USB HSM
// Using the `yubihsm` crate (native SDK)
use yubihsm::{Client, Connector, Credentials, UsbConfig};
use yubihsm::ecdsa;
// Connect to YubiHSM 2 via USB
let connector = Connector::usb(&UsbConfig::default());
let client = Client::open(connector, Credentials::default(), true)?;
// Generate Ed25519 key on device
let key_id = client.generate_asymmetric_key(
0x0001, // Key ID
"My signing key".into(), // Label
vec![1], // Domains
yubihsm::asymmetric::Algorithm::Ed25519,
)?;
// Sign with key that never leaves the HSM
let signature = client.sign_ed25519(0x0001, b"message")?;
80.5 Cloud KMS Integration
// AWS KMS via aws-sdk-kms
use aws_sdk_kms::{Client, primitives::Blob, types::{SigningAlgorithmSpec, MessageType}};
let config = aws_config::load_from_env().await;
let client = Client::new(&config);
// Sign with KMS-managed key
let result = client.sign()
.key_id("arn:aws:kms:us-east-1:123456:key/abcd-1234")
.message(Blob::new(b"hash-of-data"))
.message_type(MessageType::Digest)
.signing_algorithm(SigningAlgorithmSpec::EcdsaSha256)
.send().await?;
let signature = result.signature().unwrap();
// Encrypt with KMS (envelope encryption pattern)
let encrypt_result = client.encrypt()
.key_id("alias/my-key")
.plaintext(Blob::new(b"secret data"))
.send().await?;
80.6 TPM 2.0 — Trusted Platform Module
// Using the `tss-esapi` crate for TPM 2.0
use tss_esapi::{Context, TctiNameConf};
use tss_esapi::interface_types::resource_handles::Hierarchy;
// Connect to TPM
let tcti = TctiNameConf::from_environment_variable()?;
let mut context = Context::new(tcti)?;
// Get random bytes from TPM hardware RNG
let random_bytes = context.get_random(32)?;
// Create primary key in TPM
// (Keys are bound to the TPM — cannot be extracted)
// context.create_primary(Hierarchy::Owner, &key_template)?;
80.7 SoftHSM for Development
SoftHSM is a software-only HSM for development and testing. It implements PKCS#11 without real hardware.
# Install SoftHSM
$ sudo apt install softhsm2
$ softhsm2-util --init-token --slot 0 --label "dev" --pin 1234 --so-pin 5678
# Set PKCS#11 library path
$ export PKCS11_LIB=/usr/lib/softhsm/libsofthsm2.so
# Now the `cryptoki` code from 80.3 works against SoftHSM
80.8 HSM Architecture Patterns
- Envelope Encryption: HSM encrypts a data key → data key encrypts your data. HSM only handles small operations, data encryption is fast in software.
- Key Wrapping: Export keys wrapped (encrypted) by another HSM key for backup or transport between HSMs.
- Multi-Party Key Ceremony: Split HSM admin credentials among multiple people (M-of-N threshold) for root key generation.
- Key Rotation: Periodically generate new keys, re-encrypt data with new key, retire old key. HSMs track key lifecycle metadata.
- Attestation: TPMs can prove a system's boot state hasn't been tampered with (measured boot / sealed storage).
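The envelope-encryption flow above, reduced to a runnable toy where a single-byte XOR stands in for both the HSM's wrap operation and the bulk cipher (purely illustrative; real systems use AES-GCM for data and a KMS/HSM for the KEK):

```rust
// Toy "cipher": XOR every byte with a one-byte key.
fn xor_with(key: u8, data: &[u8]) -> Vec<u8> {
    data.iter().map(|b| b ^ key).collect()
}

fn main() {
    let kek = 0x5Au8;      // master key: lives "inside the HSM"
    let data_key = 0x42u8; // per-object data key, generated in software

    // Bulk data is encrypted locally with the data key (fast path)...
    let ciphertext = xor_with(data_key, b"lots of application data");
    // ...and only the tiny data key crosses into the HSM to be wrapped.
    let wrapped_key = xor_with(kek, &[data_key]);

    // Decrypt: unwrap the data key via the HSM, then decrypt locally.
    let recovered_key = xor_with(kek, &wrapped_key)[0];
    let plaintext = xor_with(recovered_key, &ciphertext);
    assert_eq!(plaintext, b"lots of application data".to_vec());
}
```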
| Crate | HSM Type | Notes |
|---|---|---|
| cryptoki | Any PKCS#11 HSM | Safe Rust wrapper, most universal |
| yubihsm | YubiHSM 2 | Native SDK, USB or HTTP connector |
| aws-sdk-kms | AWS KMS / CloudHSM | Async, tokio-based |
| tss-esapi | TPM 2.0 | Low-level TPM access |
| pcsc | Smart cards (PIV, OpenPGP) | PC/SC interface |
The cryptoki crate provides safe PKCS#11 bindings, and cloud KMS SDKs offer managed HSM-backed operations. The typical pattern is envelope encryption: the HSM handles a small master key, which wraps data keys used for bulk encryption in software.
81. Asymmetric Encryption Advanced
Asymmetric (public-key) encryption uses mathematically related key pairs: a public key anyone can have and a private key kept secret. It enables encryption without pre-shared secrets, digital signatures, and key agreement. This section covers the full landscape beyond ECDSA (see section 59).
81.1 Asymmetric Algorithms Comparison
| Algorithm | Type | Key Size | Speed | Use Case | Rust Crate |
|---|---|---|---|---|---|
| RSA | Encryption + Signatures | 2048–4096 bit | Slow | Legacy, TLS, code signing | rsa |
| Ed25519 | Signatures only | 256 bit | Very Fast | Modern signatures, SSH, JWT | ed25519-dalek |
| X25519 | Key Exchange only | 256 bit | Very Fast | Diffie-Hellman, TLS 1.3 | x25519-dalek |
| ECDSA (P-256) | Signatures only | 256 bit | Fast | TLS, AWS, FIPS compliance | p256 |
| ECDSA (secp256k1) | Signatures only | 256 bit | Fast | Bitcoin, Ethereum | k256 |
| ECDH (P-256) | Key Exchange only | 256 bit | Fast | TLS, FIPS-compliant key exchange | p256 |
| ML-KEM (Kyber) | Key Encapsulation | ~1568 bytes | Fast | Post-quantum key exchange | ml-kem |
| ML-DSA (Dilithium) | Signatures only | ~2560 bytes | Fast | Post-quantum signatures | ml-dsa |
81.2 RSA Encryption & Signatures
use rsa::{RsaPrivateKey, RsaPublicKey};
use rsa::pkcs1v15::{SigningKey, VerifyingKey};
use rsa::Oaep;
use sha2::Sha256;
use signature::{Signer, Verifier};
use rand::rngs::OsRng;
// Generate 2048-bit RSA keypair
let private_key = RsaPrivateKey::new(&mut OsRng, 2048)?;
let public_key = RsaPublicKey::from(&private_key);
// ── RSA-OAEP Encryption (preferred over PKCS#1 v1.5) ──
let padding = Oaep::new::<Sha256>();
let ciphertext = public_key.encrypt(&mut OsRng, padding, b"secret")?;
let plaintext = private_key.decrypt(Oaep::new::<Sha256>(), &ciphertext)?;
// ── RSA Signatures ──
let signing_key = SigningKey::<Sha256>::new(private_key);
let signature = signing_key.sign(b"message to sign");
let verifying_key = VerifyingKey::<Sha256>::from(&public_key);
verifying_key.verify(b"message to sign", &signature)?;
81.3 Hybrid Encryption Pattern
Asymmetric crypto is slow and has size limits. The standard pattern is hybrid encryption: encrypt data with a symmetric key, then encrypt that key with the recipient's public key.
use aes_gcm::{Aes256Gcm, Key, Nonce};
use aes_gcm::aead::{Aead, KeyInit, OsRng};
use rsa::{RsaPublicKey, RsaPrivateKey, Oaep};
use sha2::Sha256;
use rand::Rng;
// ── ENCRYPT (sender) ──
fn hybrid_encrypt(
recipient_pub: &RsaPublicKey,
plaintext: &[u8],
) -> Result<(Vec<u8>, Vec<u8>, [u8; 12]), Box<dyn std::error::Error>> {
// 1. Generate random symmetric key
let sym_key = Aes256Gcm::generate_key(OsRng);
let cipher = Aes256Gcm::new(&sym_key);
// 2. Encrypt data with symmetric key
let mut nonce_bytes = [0u8; 12];
OsRng.fill(&mut nonce_bytes);
let nonce = Nonce::from_slice(&nonce_bytes);
let ciphertext = cipher.encrypt(nonce, plaintext)?;
// 3. Encrypt symmetric key with RSA public key
let encrypted_key = recipient_pub.encrypt(
&mut OsRng, Oaep::new::<Sha256>(), sym_key.as_slice()
)?;
Ok((encrypted_key, ciphertext, nonce_bytes))
}
// ── DECRYPT (recipient) ──
fn hybrid_decrypt(
priv_key: &RsaPrivateKey,
encrypted_key: &[u8],
ciphertext: &[u8],
nonce_bytes: &[u8; 12],
) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
// 1. Decrypt symmetric key with RSA private key
let sym_key_bytes = priv_key.decrypt(Oaep::new::<Sha256>(), encrypted_key)?;
let sym_key = Key::<Aes256Gcm>::from_slice(&sym_key_bytes);
// 2. Decrypt data with symmetric key
let cipher = Aes256Gcm::new(sym_key);
let nonce = Nonce::from_slice(nonce_bytes);
let plaintext = cipher.decrypt(nonce, ciphertext)?;
Ok(plaintext)
}
81.4 ECIES — Elliptic Curve Integrated Encryption
ECIES combines ECDH key agreement with symmetric encryption. More efficient than RSA hybrid for modern systems.
use x25519_dalek::{EphemeralSecret, PublicKey, StaticSecret};
use rand::rngs::OsRng;
use hkdf::Hkdf;
use sha2::Sha256;
use chacha20poly1305::{ChaCha20Poly1305, Key, Nonce};
use chacha20poly1305::aead::{Aead, KeyInit};
// Recipient has a long-term keypair
let recipient_secret = StaticSecret::random_from_rng(OsRng);
let recipient_public = PublicKey::from(&recipient_secret);
// ── ENCRYPT (sender) ──
// 1. Generate ephemeral keypair
let ephemeral_secret = EphemeralSecret::random_from_rng(OsRng);
let ephemeral_public = PublicKey::from(&ephemeral_secret);
// 2. ECDH → shared secret
let shared = ephemeral_secret.diffie_hellman(&recipient_public);
// 3. KDF → derive encryption key from shared secret
let hkdf = Hkdf::<Sha256>::new(None, shared.as_bytes());
let mut enc_key = [0u8; 32];
hkdf.expand(b"ecies-encryption", &mut enc_key)?;
// 4. Encrypt with derived key
let cipher = ChaCha20Poly1305::new(Key::from_slice(&enc_key));
let nonce = Nonce::from_slice(b"unique nonce"); // 96-bit (12 bytes)
let ciphertext = cipher.encrypt(nonce, b"secret message".as_ref())?;
// Send: (ephemeral_public, nonce, ciphertext) to recipient
// ── DECRYPT (recipient) ──
let shared = recipient_secret.diffie_hellman(&ephemeral_public);
// Same KDF + decrypt steps...
81.5 Key Serialization & PEM/DER Formats
use rsa::{RsaPrivateKey, RsaPublicKey};
use rsa::pkcs8::{EncodePrivateKey, DecodePrivateKey, EncodePublicKey, DecodePublicKey};
// Export to PEM (text format, base64-encoded)
let pem = private_key.to_pkcs8_pem(rsa::pkcs8::LineEnding::Lf)?;
std::fs::write("private.pem", pem.as_bytes())?;
// Export to DER (binary format)
let der = private_key.to_pkcs8_der()?;
std::fs::write("private.der", der.as_bytes())?;
// Import from PEM
let pem_data = std::fs::read_to_string("private.pem")?;
let key = RsaPrivateKey::from_pkcs8_pem(&pem_data)?;
// Public key export/import
let pub_pem = public_key.to_public_key_pem(rsa::pkcs8::LineEnding::Lf)?;
let pub_key = RsaPublicKey::from_public_key_pem(&pub_pem)?;
81.6 Digital Certificates (X.509)
// Generate self-signed certificate with `rcgen`
use rcgen::{CertificateParams, KeyPair, generate_simple_self_signed};
let subject_alt_names = vec!["localhost".to_string(), "127.0.0.1".to_string()];
let cert = generate_simple_self_signed(subject_alt_names)?;
// PEM-encoded certificate and private key
let cert_pem = cert.cert.pem();
let key_pem = cert.key_pair.serialize_pem();
// Parse X.509 certificate
use x509_parser::prelude::*;
let (_rem, cert) = X509Certificate::from_der(&der_bytes)?;
println!("Subject: {}", cert.subject());
println!("Issuer: {}", cert.issuer());
println!("Valid: {} to {}", cert.validity().not_before, cert.validity().not_after);
81.7 Post-Quantum Cryptography
Quantum computers threaten RSA and ECC. NIST has standardized ML-KEM (Kyber) and ML-DSA (Dilithium) as post-quantum replacements.
// ML-KEM (Kyber) — Post-quantum key encapsulation
use ml_kem::{MlKem768, KemCore};
use kem::{Decapsulate, Encapsulate}; // traits from the RustCrypto `kem` crate
use rand::rngs::OsRng;
// Key generation
let (dk, ek) = MlKem768::generate(&mut OsRng);
// Encapsulate (sender creates shared secret + ciphertext)
let (ct, shared_secret_sender) = ek.encapsulate(&mut OsRng)?;
// Decapsulate (recipient recovers shared secret)
let shared_secret_recipient = dk.decapsulate(&ct)?;
// Both shared secrets are identical — use as symmetric key
assert_eq!(shared_secret_sender, shared_secret_recipient);
81.8 The ring Crate — High-Performance Core Crypto
use ring::{signature, rand::SystemRandom};
// Ed25519 with ring (BoringSSL-derived, heavily optimized)
let rng = SystemRandom::new();
let pkcs8 = signature::Ed25519KeyPair::generate_pkcs8(&rng)?;
let key_pair = signature::Ed25519KeyPair::from_pkcs8(pkcs8.as_ref())?;
let sig = key_pair.sign(b"message");
// Verify
let pub_key = signature::UnparsedPublicKey::new(
&signature::ED25519,
key_pair.public_key().as_ref(),
);
pub_key.verify(b"message", sig.as_ref())?;
81.9 When to Use What
| Need | Algorithm | Why |
|---|---|---|
| Fast signatures | Ed25519 | Small keys, deterministic, fast |
| FIPS-compliant signatures | ECDSA P-256 | Required by US government systems |
| Blockchain signatures | ECDSA secp256k1 | Bitcoin/Ethereum standard |
| Key exchange (modern) | X25519 | Fast, safe defaults, used in TLS 1.3 |
| Encryption to a public key | RSA-OAEP or ECIES | RSA for legacy; ECIES for modern |
| Future-proofing | ML-KEM + X25519 hybrid | Quantum-resistant + classical fallback |
| Legacy compatibility | RSA 2048+ | Widely supported, well understood |
The signature trait crate provides a universal interface (Signer/Verifier) across all signature algorithms — Ed25519, ECDSA, RSA — so you can write algorithm-agnostic code and swap implementations without changing business logic. For encryption, the standard pattern is hybrid: use asymmetric crypto only to establish a shared symmetric key (via RSA-OAEP, ECDH, or KEM), then encrypt bulk data with AES-GCM or ChaCha20-Poly1305. This gives you the best of both worlds — key distribution from asymmetric and performance from symmetric.
82. ECDSA vs secp256k1 — Deep Comparison Advanced
A common source of confusion: ECDSA is a signature algorithm (a recipe), while secp256k1 is a specific elliptic curve (an ingredient). ECDSA can run on any suitable elliptic curve, and secp256k1 can be used with algorithms other than ECDSA (e.g., Schnorr). They are independent concepts that are often used together.
82.1 The Core Distinction
ECDSA specifies how a signature is produced and verified; secp256k1 specifies the group in which that arithmetic takes place. Every ECDSA deployment must choose a curve, and every curve can host several algorithms: the two choices are orthogonal.
82.2 Side-by-Side Comparison
| Aspect | ECDSA | secp256k1 |
|---|---|---|
| What is it? | A signature algorithm | An elliptic curve |
| Category | Procedure / Protocol | Mathematical object / Parameters |
| Defined by | ANSI X9.62 / FIPS 186-4 | SEC 2 (Standards for Efficient Cryptography) |
| Can exist without the other? | Yes — works on P-256, P-384, Brainpool, etc. | Yes — used with Schnorr, ECDH, etc. |
| Role in Bitcoin | The signing algorithm (being replaced by Schnorr) | The curve (stays the same) |
| Role in Ethereum | The signing algorithm | The curve |
| Key generation | Not involved (that's the curve's job) | Defines the group, generator point G, order n |
| Signature output | Produces (r, s) pair | Defines the field/group these values live in |
| Rust trait | ecdsa::SigningKey<C> — generic over curve C | k256::Secp256k1 — a specific curve type |
82.3 Elliptic Curves Compared
secp256k1 is just one of many curves ECDSA can use. Here's how the common curves differ:
| Curve | Equation | Key Bits | Security Level | Primary Use | FIPS Approved |
|---|---|---|---|---|---|
| secp256k1 | y² = x³ + 7 | 256 | 128-bit | Bitcoin, Ethereum | ❌ No |
| P-256 (secp256r1) | y² = x³ + ax + b (NIST params) | 256 | 128-bit | TLS, WebAuthn, AWS | ✅ Yes |
| P-384 (secp384r1) | y² = x³ + ax + b (NIST params) | 384 | 192-bit | Government / high-assurance | ✅ Yes |
| Curve25519 | y² = x³ + 486662x² + x | 256 | 128-bit | EdDSA (Ed25519), X25519 | ❌ (but widely trusted) |
82.4 Algorithms That Use secp256k1
The secp256k1 curve isn't limited to ECDSA. Multiple cryptographic algorithms can operate on it:
| Algorithm | Type | Standard | Curve | Used By |
|---|---|---|---|---|
| ECDSA | Signature | ANSI X9.62 | secp256k1 | Bitcoin (legacy), Ethereum |
| Schnorr | Signature | BIP-340 | secp256k1 | Bitcoin Taproot (since 2021) |
| ECDH | Key Exchange | ANSI X9.63 | secp256k1 | Encrypted messaging (rare) |
| ECIES | Encryption | IEEE P1363a | secp256k1 | Ethereum ECIES (whisper, etc.) |
| MuSig2 | Multi-signature | BIP-327 | secp256k1 | Bitcoin multi-sig |
| Threshold Sigs | Distributed signing | FROST | secp256k1 | Custody, MPC wallets |
82.5 ECDSA on Different Curves in Rust
// The `ecdsa` crate provides a GENERIC ECDSA implementation
// parameterized over the curve. The curve is a type parameter.
// ── ECDSA + secp256k1 (Bitcoin/Ethereum) ──
use k256::ecdsa::{SigningKey as K256SigningKey, Signature as K256Signature};
use k256::ecdsa::signature::Signer;
let sk = K256SigningKey::random(&mut OsRng);
let sig: K256Signature = sk.sign(b"hello"); // ECDSA on secp256k1
// ── ECDSA + P-256 (TLS/WebAuthn/FIPS) ──
use p256::ecdsa::{SigningKey as P256SigningKey, Signature as P256Signature};
use p256::ecdsa::signature::Signer;
let sk = P256SigningKey::random(&mut OsRng);
let sig: P256Signature = sk.sign(b"hello"); // ECDSA on P-256
// Same algorithm (ECDSA), same API, different curve!
// The `Signer` trait is identical — only the type parameter changes.
82.6 Schnorr vs ECDSA (Both on secp256k1)
Bitcoin originally used ECDSA on secp256k1. Taproot (2021) introduced Schnorr signatures, also on secp256k1. Same curve, different algorithm.
| Property | ECDSA (secp256k1) | Schnorr (secp256k1) |
|---|---|---|
| Standard | ANSI X9.62 | BIP-340 |
| Signature Size | ~72 bytes (DER) / 64 bytes (compact) | 64 bytes |
| Linearity | Not linear (complex math for multi-sig) | Linear (signatures can be aggregated) |
| Multi-signature | Requires complex MPC protocols | Native: MuSig2 — sum signatures trivially |
| Batch Verification | Not efficiently batchable | ~2x faster batch verification |
| Provable Security | Proven secure in generic group model | Proven secure in ROM (stronger proof) |
| Malleability | Malleable without low-S normalization | Not malleable by design |
| Bitcoin Support | Since genesis (2009) | Since Taproot activation (Nov 2021) |
// Schnorr signatures on secp256k1 in Rust
use secp256k1::{Secp256k1, Keypair, Message};
use secp256k1::rand::rngs::OsRng;
use secp256k1::hashes::{Hash, sha256};
let secp = Secp256k1::new();
let keypair = Keypair::new(&secp, &mut OsRng);
let msg_hash = sha256::Hash::hash(b"Taproot transaction");
let message = Message::from_digest(msg_hash.to_byte_array());
// Schnorr sign (BIP-340) — same curve as ECDSA, different algorithm
let schnorr_sig = secp.sign_schnorr_no_aux_rand(&message, &keypair);
// ECDSA sign — same keypair, same curve, different algorithm
let ecdsa_sig = secp.sign_ecdsa(&message, &keypair.secret_key());
// Both signatures use secp256k1 — the curve is constant,
// only the signing algorithm differs!
// Verify Schnorr (uses x-only public key)
let (xonly_pk, _) = keypair.x_only_public_key();
secp.verify_schnorr(&schnorr_sig, &message, &xonly_pk)?;
// Verify ECDSA (uses full public key)
secp.verify_ecdsa(&message, &ecdsa_sig, &keypair.public_key())?;
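The linearity row from the comparison table can be made concrete with toy modular arithmetic (insecure by construction, since discrete logs are trivial in this group; the point is only that Schnorr's s = r + c·x is linear, so signatures and public keys can simply be summed):

```rust
const P: u128 = 2_147_483_647; // Mersenne prime 2^31 - 1 (toy modulus)
const G: u128 = 7;             // toy "generator"

// Stand-in for scalar multiplication of the curve's base point
fn pt(x: u128) -> u128 { x * G % P }

// Schnorr signing: s = r + c*x  (r = nonce, c = challenge, x = key)
fn sign(x: u128, r: u128, c: u128) -> u128 { (r + c * x) % P }

// Verification: s*G == R + c*Pub
fn verify(s: u128, big_r: u128, pubkey: u128, c: u128) -> bool {
    pt(s) == (big_r + c * pubkey) % P
}

fn main() {
    let c = 99; // shared challenge (in MuSig2, derived from all nonces)
    let (x1, r1) = (1111u128, 2222u128);
    let (x2, r2) = (3333u128, 4444u128);
    let (s1, s2) = (sign(x1, r1, c), sign(x2, r2, c));
    assert!(verify(s1, pt(r1), pt(x1), c));
    // Aggregation: summed signatures verify against summed keys/nonces.
    let s_agg = (s1 + s2) % P;
    assert!(verify(s_agg, (pt(r1) + pt(r2)) % P, (pt(x1) + pt(x2)) % P, c));
}
```

ECDSA's s = r⁻¹(H(m) + c·x) contains a modular inversion, which is why no such trivial aggregation exists for it.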
82.7 Generic ECDSA Code in Rust
The RustCrypto ecdsa crate lets you write curve-agnostic signature code:
use ecdsa::{SigningKey, Signature, SignatureSize};
use ecdsa::signature::{Signer, Verifier};
use elliptic_curve::{CurveArithmetic, FieldBytesSize};
use elliptic_curve::generic_array::ArrayLength;
// Generic function: works with ANY ECDSA-compatible curve
// (trait bounds abbreviated; a fully compiling version needs a few more)
fn sign_and_verify<C>(message: &[u8]) -> bool
where
C: CurveArithmetic + ecdsa::PrimeCurve,
SignatureSize<C>: ArrayLength<u8>,
SigningKey<C>: Signer<Signature<C>>,
{
let sk = SigningKey::<C>::random(&mut OsRng);
let vk = sk.verifying_key();
let sig: Signature<C> = sk.sign(message);
vk.verify(message, &sig).is_ok()
}
// Call with different curves — same function, different math underneath
let ok_k256 = sign_and_verify::<k256::Secp256k1>(b"test"); // Bitcoin curve
let ok_p256 = sign_and_verify::<p256::NistP256>(b"test"); // NIST curve
82.8 Decision Matrix
| Scenario | Algorithm | Curve | Rust Crate |
|---|---|---|---|
| Bitcoin transactions | Schnorr (Taproot) or ECDSA (legacy) | secp256k1 | secp256k1 |
| Ethereum transactions | ECDSA | secp256k1 | k256 |
| TLS certificates | ECDSA | P-256 or P-384 | p256, ring |
| WebAuthn / FIDO2 | ECDSA | P-256 | p256 |
| SSH keys | EdDSA (not ECDSA) | Ed25519 (not secp256k1) | ed25519-dalek |
| JWT signing | ECDSA (ES256) | P-256 | p256, ring |
| WASM-friendly blockchain | ECDSA | secp256k1 | k256 (no C deps) |
| FIPS 140-2 compliance | ECDSA | P-256 or P-384 | ring, aws-lc-rs |
| Custom blockchain | EdDSA (simpler, faster) | Ed25519 | ed25519-dalek |
| Multi-party signing | Schnorr (MuSig2) or FROST | secp256k1 | secp256k1 |
82.9 The Rust Crate Hierarchy
In the RustCrypto stack, the elliptic-curve crate defines generic curve traits, the ecdsa crate implements the algorithm generically over any such curve, and k256 / p256 supply the concrete curve arithmetic in pure Rust. The separate secp256k1 crate instead wraps Bitcoin Core's C library libsecp256k1 and covers both ECDSA and Schnorr on that one curve.
82.10 Common Misconceptions
- "secp256k1 is a signing algorithm" — No. It's a curve. ECDSA and Schnorr are the signing algorithms that run on secp256k1.
- "ECDSA only works with secp256k1" — No. ECDSA works on P-256, P-384, Brainpool, and many other curves. secp256k1 is popular because of Bitcoin.
- "Ed25519 is ECDSA on Curve25519" — No. Ed25519 uses EdDSA (Edwards-curve DSA), a different algorithm with different math and properties. It uses the Edwards form of Curve25519.
- "secp256k1 and secp256r1 are interchangeable" — No. secp256k1 (k = Koblitz) and secp256r1 (r = random / NIST P-256) are completely different curves with different parameters, security properties, and ecosystems.
- "You need secp256k1 for all blockchain work" — No. Many modern blockchains (Solana, Cosmos via Ed25519) and L2s use different curves entirely.
The ecdsa crate provides a generic SigningKey<C> parameterized over any curve, while k256 and p256 supply the specific curve implementations. This lets you write curve-agnostic code via traits and swap curves at the type level.