Mor Shonrai

A blog covering everything from programming, data science, astronomy and anything that pops into my head.

Rust Tutorial Part I

This tutorial was part of a workshop I gave to graduate-level astronomy students in the run-up to the 2023 Advent of Code. The idea of this tutorial wasn’t to teach the nitty-gritty of the Rust language, but rather to get someone who is somewhat experienced with programming familiar enough with Rust to start writing code. It focuses more on “hows” than “whys”, assuming that you don’t need a computer science background to start writing code.

By the end of this tutorial series, you will have gained enough knowledge of Rust to begin developing high-performing code. This tutorial will be broken up into sections. The first section (this section) focuses on the basics of Rust and the borrow checker.

I encourage you to work through the examples at your own pace, attempting to solve issues before reviewing the solutions. For your convenience, unanswered problems are available on GitHub, with the solutions provided in the “solutions” branch.

What is Rust?

Rust is a high-performance, memory-efficient programming language that began as a personal project of Mozilla employee Graydon Hoare and was later sponsored and developed by Mozilla’s research team (Rust, not Firefox, is Mozilla’s greatest industry contribution).

Prioritizing performance and memory safety, Rust utilizes a robust type system and an innovative “ownership” model to guarantee memory safety and thread safety at compile time.

This approach aims to address vulnerabilities arising from memory errors, with estimates from Microsoft suggesting that approximately 70% of code vulnerabilities stem from memory-related issues (source).

Rust’s ability to produce fast, efficient, and resilient code has catapulted it to the top of the list as the most admired language amongst developers. The community of Rust programmers affectionately refer to themselves as “Rustaceans,” and the language’s unofficial mascot is Ferris the crab:

Ferris the Crab (from https://rustacean.net). The name Ferris is a reference to “ferrous”, meaning “containing iron”.

Rust Compared to Python

When comparing Rust to a language like Python, several key differences become apparent:

  • Performance: Rust is renowned for its high performance, often being comparable to languages like C or C++. Python, on the other hand, tends to be significantly slower than Rust, emphasizing ease of development over raw performance.
  • Type System: Python is a dynamically typed language, meaning the interpreter infers variable types at runtime, allowing flexibility but increasing the potential for type-related errors. In contrast, Rust requires variables to have known types at compile time, enhancing safety and allowing for optimizations to be made by the compiler.
  • Compilation: Rust is a compiled language, while Python is interpreted. Python code is executed by an interpreter, converting code to bytecode at runtime, resulting in slower performance. Rust, as a compiled language, produces machine code binaries before runtime, reducing overhead and enabling compiler optimizations for faster, more memory-efficient execution.
  • Memory Management: Python (in its standard CPython implementation) manages memory at runtime, combining reference counting with a “garbage collector” that periodically looks for objects that are no longer reachable, which impacts speed and memory efficiency. Rust utilizes an “ownership” memory model enforced by the “borrow checker” at compile time. Each value in Rust has a single owner, and memory is automatically freed when the owner goes out of scope. This approach, without a costly garbage collector, contributes to Rust’s fast runtime.
  • Thread Safety: Python’s Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, ensuring safety across threads but causing a bottleneck for CPU-bound parallel code. In Rust, the ownership model, combined with allowing either numerous immutable references or a single mutable reference at a time, rules out data races at compile time without the restrictions posed by a GIL.
  • Package Management: Both Python and Rust use package management systems to handle external libraries or crates (in Rust). Rust utilizes the cargo package manager and toml files to manage project dependencies, while Python uses tools like pip and requirements.txt to manage packages.

Rust and Python employ different approaches to achieve their goals, with Rust focusing on performance, memory safety, and concurrency, whereas Python emphasizes ease of use and flexibility.
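To make the thread-safety point a little more concrete, here is a minimal sketch that sums the two halves of a list on two threads. The function name parallel_sum is made up for this illustration, and the sketch assumes Rust 1.63 or newer, where std::thread::scope is available. Both threads only hold immutable references to the data, which the compiler accepts; the syntax itself is covered later in the series.

```rust
use std::thread;

/// Hypothetical helper: sums a slice by splitting it in two and
/// summing each half on its own thread.
fn parallel_sum(values: &[i64]) -> i64 {
    // Both halves are immutable borrows of the same data.
    let (left, right) = values.split_at(values.len() / 2);

    // Scoped threads are guaranteed to finish before `scope` returns,
    // so they are allowed to borrow `left` and `right`.
    thread::scope(|scope| {
        let left_handle = scope.spawn(|| left.iter().sum::<i64>());
        let right_handle = scope.spawn(|| right.iter().sum::<i64>());

        // `join` waits for each thread and returns its result.
        left_handle.join().unwrap() + right_handle.join().unwrap()
    })
}

fn main() {
    let values: Vec<i64> = (1..=100).collect();
    println!("sum = {}", parallel_sum(&values));
}
```

If one of the threads tried to modify the list while the other was reading it, the borrow checker would reject the program at compile time rather than letting it misbehave at runtime.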

Installing Rust

Comprehensive installation instructions for Rust can be accessed here. The installation process uses rustup, a tool that installs both the Rust compiler (rustc) and the package manager (cargo). These tools are available for Linux, macOS, and Windows; the following command installs rustup on Linux, macOS, and the Windows Subsystem for Linux (WSL):

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

For alternative methods of installing Rust, refer to this page.

For this tutorial, we’ll utilize the official Rust Docker image to compile and run code within a container. To pull the image, execute:

docker pull rust

Create an interactive Docker container using the following command:

docker run -it --rm -v $(pwd):/local_data -w /local_data rust bash

Explanation:

  • run starts a container from the rust image and executes bash inside it; the -it flags make the session interactive, giving us a Bash shell to work in.
  • --rm ensures the container is deleted after use.
  • -v $(pwd):/local_data mounts the current directory on the local machine to /local_data in the container.
  • -w /local_data sets the working directory to /local_data within the container.

Basics of Rust

Hello World

To create a new project in Rust, utilize the cargo command:

cargo new hello

This will create a new directory called hello.

> ls -ah hello

. .. .git .gitignore Cargo.toml src

When using cargo new, a new Rust project is initialized. Alongside creating the project structure, cargo automatically sets up a new Git repository for the package and adds a Rust-specific .gitignore file.

The newly created project includes a Cargo.toml file, which serves as the manifest file for the project.

This file contains details about the project, including external dependencies, package name, and versions used.

> cat hello/Cargo.toml

[package]
name = "hello"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]

Cargo also sets up a src directory with a main.rs file containing an example program that will print “Hello, world!”:

fn main() {
    println!("Hello, world!");
}

In this example, the following points are illustrated:

  • Functions in Rust are defined using the fn keyword.
  • The main function designates the entry point of the code to the compiler.
  • Code blocks are enclosed within {} to denote scopes.
  • Rust includes macros (indicated by !, which will be covered later) like println! used to print the string "Hello, world!" to the screen.
  • Statements in Rust are terminated with a ; (exceptions for when to omit the ; will be explained later).

From inside the hello directory, this example can be compiled directly using rustc:

rustc src/main.rs -o main

to create the executable main.

Alternatively, we can use cargo build to compile:

> cargo build
   Compiling hello v0.1.0 (/local_data/hello)
    Finished dev [unoptimized + debuginfo] target(s) in 0.28s

Executing this command will compile an executable located at target/debug/hello. To run the executable, you can either call the executable directly or use the cargo run command. When using cargo run, if there are changes in the code or if the code hasn’t been compiled previously, it automatically triggers the cargo build command before executing the program.

> cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/hello`
Hello, world!

The executable is typically found within a debug folder. By default, Rust generates debug information useful for code analysis and debugging. To create an optimized version for end-users, the --release flag can be utilized:

> cargo build --release
   Compiling hello v0.1.0 (/local_data/hello)
    Finished release [optimized] target(s) in 0.25s

This will take longer to compile as rustc is optimizing the code.
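If you want to check which kind of build you are running, one possible sketch uses the standard cfg!(debug_assertions) macro. The function name build_profile is invented for this example, and it assumes cargo’s default profiles, where debug assertions are switched on for cargo build and off for cargo build --release (this can be overridden in Cargo.toml).

```rust
/// Hypothetical helper: reports whether debug assertions are enabled,
/// which by default distinguishes a debug build from a release build.
fn build_profile() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}

fn main() {
    println!("This binary was built with the {} profile", build_profile());
}
```

Running this with cargo run and then with cargo run --release should, under the default settings, report a different profile each time.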

Types in Rust

In Rust, types must be known at compile time. You can explicitly specify the type of a variable using the syntax let my_variable: type = value, where the type is specified after the variable name using a :. The following example demonstrates explicit declaration of variable types on lines 3-6:

fn main() {
    // Explicitly declaring the type of the variable
    let my_integer: i32 = 42;
    let my_float: f64 = 3.14;
    let my_character: char = 'A';
    let my_boolean: bool = true;

    // Rust can infer types in many cases, so explicit annotation is not always necessary
    let inferred_integer = 10;
    let inferred_float = 2.5;
    let inferred_character = 'B';
    let inferred_boolean = false;


    // Explicitly declaring the type of the variable within the passed value
    let my_integer_in_value = 17_i8;
    let my_float_in_value = 6.28_f32;
    let my_large_unsigned_32 = 1_000_000_u32;


    // Printing the values along with their types
    println!("Integer: {} (Type: i32)", my_integer);
    println!("Float: {} (Type: f64)", my_float);
    println!("Character: {} (Type: char)", my_character);
    println!("Boolean: {} (Type: bool)", my_boolean);

    println!("Inferred Integer: {} (Type: inferred)", inferred_integer);
    println!("Inferred Float: {} (Type: inferred)", inferred_float);
    println!("Inferred Character: {} (Type: inferred)", inferred_character);
    println!("Inferred Boolean: {} (Type: inferred)", inferred_boolean);


    println!("Integer: {} (Type: inferred from value)", my_integer_in_value);
    println!("Float: {} (Type: inferred from value)", my_float_in_value);
    println!("Unsigned: {} (Type: inferred from value)", my_large_unsigned_32);
}
types.rs

The Rust compiler features type inference, enabling omission of the variable type, as it can deduce the type based on the assigned value. Internally, the compiler determines the variable’s type during compilation based on the provided value. An example illustrating this behavior is demonstrated in lines 9-12 of types.rs.

Additionally, for numeric literals we can specify the type within the value itself by appending a type suffix, such as 17_i8 or 6.28_f32. This method is showcased in lines 16-18 of types.rs.
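To see which types the compiler actually settles on, a small sketch like the one below can help. The helper type_of is a made-up name; it wraps the standard std::any::type_name function. The standard library documents the returned text as intended for debugging only, so treat the exact strings as informative rather than guaranteed.

```rust
use std::any::type_name;

/// Hypothetical helper: returns the name of the type the compiler
/// assigned to the value passed in.
fn type_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let inferred_integer = 10; // no annotation: integers default to i32
    let inferred_float = 2.5; // no annotation: floats default to f64
    let suffixed_integer = 17_i8; // the suffix fixes the type

    println!("inferred_integer: {}", type_of(&inferred_integer));
    println!("inferred_float: {}", type_of(&inferred_float));
    println!("suffixed_integer: {}", type_of(&suffixed_integer));
}
```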

In Rust, conversion between different types can be achieved using the conversion methods into, try_into, from, and try_from, or the as keyword. Below are some examples:

fn main() {
    let integer_a: i32 = 40;
    let float_b: f32 = integer_a as f32;
    
    
    let integer_c: i32 = 3;
    // We're using "try into" here because we could have a negative integer
    let unsigned_d: u32 = integer_c.try_into().unwrap();

    let float_64_e: f64 = 6.5;
    // This won't work because going from f64->f32 loses precision and range,
    // and there is some funky behaviour around inf, so f32 has no try_from for f64
    // let float_32_f: f32 = f32::try_from(float_64_e).unwrap();
    let float_32_f: f32 = float_64_e as f32;

    let float_32_g: f32 = f32::from(3.13);
    let i8_h: i8 = i8::from(-3);
    let u32_i: u32 = u32::try_from(8).unwrap();
}

In the above, unwrap is a shortcut that lets us skip explicit error handling. On lines 8 and 18 we are trying to convert a signed integer (which may be positive or negative) into an unsigned integer (zero or positive). This could introduce a bug, as we cannot, for example, convert -1 into an unsigned integer without losing information. To deal with this, try_into and try_from return a Result type. We’ll leave the details for another day, but for now a Result holds either the expected value (for example, the signed integer converted to an unsigned integer) or an error (for example, if we try to convert -1). unwrap extracts the converted value from the Result, and makes the program panic (stop with an error message) if the Result holds an error instead.
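As a rough sketch of the difference between a checked conversion and an as cast, consider the two helper functions below. Both names (to_unsigned and wrapping_cast) are invented for this example. The first inspects the Result from try_from instead of calling unwrap; the second shows that as never fails, but silently reinterprets a negative number.

```rust
/// Hypothetical helper: checked conversion that returns None instead of
/// panicking when the value is negative.
fn to_unsigned(value: i32) -> Option<u32> {
    match u32::try_from(value) {
        Ok(converted) => Some(converted), // the value fits in a u32
        Err(_) => None,                   // the value was negative
    }
}

/// Hypothetical helper: `as` never fails; a negative i32 is reinterpreted
/// as a (large) u32 rather than reported as an error.
fn wrapping_cast(value: i32) -> u32 {
    value as u32
}

fn main() {
    println!("{:?}", to_unsigned(3));
    println!("{:?}", to_unsigned(-1));
    println!("{}", wrapping_cast(-1)); // u32::MAX, i.e. 4294967295
}
```

Using unwrap on the failed conversion would instead stop the program, which is fine for quick experiments but is usually replaced by explicit handling like this in longer-lived code.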

Mutability

By default, variables in Rust are “immutable”, meaning they cannot be mutated or, in layman’s terms, cannot be changed once defined. Consider the following example:

fn main() {

    let x = 42;
    x -= 2;
    
}

Here we define a variable x, which the Rust compiler will infer as an integer (i32 by default, since nothing else constrains the type). We then attempt to modify the variable on line 4 by subtracting 2 from the value. This will produce the following error:

...

error[E0384]: cannot assign twice to immutable variable `x`
 --> src/main.rs:4:5
  |
3 |     let x = 42;
  |         -
  |         |
  |         first assignment to `x`
  |         help: consider making this binding mutable: `mut x`
4 |     x -= 2;
  |     ^^^^^^ cannot assign twice to immutable variable

For more information about this error, try `rustc --explain E0384`.

The error message tells us that we “cannot assign twice to immutable variable x”. The compiler also suggests a fix: “consider making this binding mutable: mut x”. Using the keyword mut we can specify that a variable is “mutable” (i.e. that its value can change).

fn main() {

    let mut x = 42;
    x -= 2;

}

In the above we specify that x is mutable. This allows us to change the value of x.

This might seem like a restriction when writing code, but it provides a large degree of safety when running a program. For example, if we have a variable that must remain constant, we cannot accidentally change its value.
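A typical place where mut is needed is a running total. The sketch below uses a made-up function, sum_to, together with a for loop and a function return value, both of which are only covered properly later in the series. Only the accumulator is declared mutable; the input stays immutable.

```rust
/// Hypothetical helper: adds up the integers from 1 to `limit`.
fn sum_to(limit: u32) -> u32 {
    // `total` changes on every pass through the loop, so it must be `mut`.
    let mut total = 0;

    // `limit` is never modified, so it can stay immutable.
    for value in 1..=limit {
        total += value;
    }

    total
}

fn main() {
    println!("The sum of 1 to 10 is {}", sum_to(10));
}
```

Removing the mut keyword from total would bring back the E0384 error shown earlier.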

Ownership in Rust

Ownership and the borrow checker constitute the foundation of Rust’s memory management. When dealing with ownership in Rust, it’s essential to remember three fundamental rules:

  • Every value in Rust has a designated owner.
  • At any given time, there can only be a single owner for a value.
  • When the owner goes out of scope, the associated value is automatically dropped.

Let’s delve into an example to illustrate this concept:

fn main() {

    let mut x = String::from("Hello");
 
    let y = x;
    println!("{}", y);

    println!("{}", x);

}

In the provided code, a new variable x of type String is created. At line 5, a new variable y is assigned the value of x. We then print y on line 6 and attempt to print x on line 8, which results in a compilation error:

error[E0382]: borrow of moved value: `x`
 --> src/main.rs:8:20
  |
3 |     let mut x = String::from("Hello");
  |         ----- move occurs because `x` has type `String`, which does not implement the `Copy` trait
4 |  
5 |     let y = x;
  |             - value moved here
...
8 |     println!("{}", x);
  |                    ^ value borrowed here after move

So what’s happening? Well, on line 5 we changed the ownership of the part of memory that holds “Hello”: ownership moved from x to y. Since a value can only ever have one owner at a time, x can no longer be used. We could, however, run this example:

fn main() {

    let mut x = String::from("Hello");
 
    let y = x;
    println!("{}", y);
    x = y;
    println!("{}", x);

}

In the above example once we have finished using y we have passed ownership back to x.
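The same hand-over of ownership happens when a value is passed to a function. In the sketch below, the function name append_world is made up for illustration: it takes ownership of a String, modifies it, and returns it, passing ownership back to the caller.

```rust
/// Hypothetical helper: takes ownership of `text`, appends to it,
/// and hands ownership back to the caller via the return value.
fn append_world(mut text: String) -> String {
    text.push_str(", world");
    text
}

fn main() {
    let mut x = String::from("Hello");

    // Ownership of the string moves into the function and the returned
    // value is assigned back to `x`, so `x` is usable again afterwards.
    x = append_world(x);

    println!("{}", x);
}
```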

The fact that every value in Rust only ever has one owner guarantees that we can never accidentally drop or delete a value that is still in use. This might seem very limiting and a heavy cost to pay for safety, but we can use “borrowing” to work around this limitation.

fn main() {

    let x = String::from("Hello");
 
    let y = &x;
    println!("{}", y);

    println!("{}", x);

}

In the above example we have “borrowed” the value of x. By borrowing the value instead of taking ownership, x maintains ownership over the value, while other parts of the code can still access it.
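Borrowing is especially handy for function arguments. The following sketch uses a hypothetical function, count_chars, that only needs to read a string, so it borrows it rather than taking ownership.

```rust
/// Hypothetical helper: borrows a string and counts its characters.
/// The caller keeps ownership of the string.
fn count_chars(text: &str) -> usize {
    text.chars().count()
}

fn main() {
    let x = String::from("Hello");

    // `&x` lends the string to the function; a `&String` is accepted
    // wherever a `&str` is expected.
    let length = count_chars(&x);

    // `x` is still usable here because it was only borrowed.
    println!("{} has {} characters", x, length);
}
```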

When borrowing values Rust’s “borrow checker” will keep track of all references and make sure that we don’t have dangling references or data races. Consider the following:

fn main() {

    let mut x = String::from("Hello");
 
    let y = &x;
    println!("{}", y);

    x += ", world";
    println!("{}", x);
    println!("{}", y);
}

In the preceding example, an immutable reference to x is established in line 5. However, an attempt to modify x occurs in line 8, resulting in a compilation error:

   Compiling tutorial v0.1.0 (/local_data)
error[E0502]: cannot borrow `x` as mutable because it is also borrowed as immutable
  --> src/main.rs:8:5
   |
5  |     let y = &x;
   |             -- immutable borrow occurs here
...
8  |     x += ", world";
   |     ^^^^^^^^^^^^^^ mutable borrow occurs here
9  |     println!("{}", x);
10 |     println!("{}", y);
   |                    - immutable borrow later used here

For more information about this error, try `rustc --explain E0502`.

What’s happening in this code? In Rust, a String stores its text in a heap-allocated buffer that has a certain capacity.

Appending to a String can exceed that capacity, in which case a larger buffer is allocated and the contents are moved, which would leave an existing reference such as y pointing at freed memory. The += operator, used to alter the value of x, takes a “mutable” reference to x and appends to the string in place. It borrows x mutably for the duration of the operation; it does not take ownership of x‘s value.

Rust enforces a rule allowing only one mutable reference or any number of immutable references at any given time. This constraint aligns with memory safety principles: preventing a scenario where one part of the code attempts to modify a value while another part tries to read it. Such a situation could lead to a race condition, causing the code’s behavior to be undefined and reliant on the order of execution.

While it might not appear problematic for sequential code like this, attempting read and write actions across different threads could result in significant issues.
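One way to make the failing snippet compile is to finish using the immutable reference before modifying the string. The sketch below wraps this in a made-up function, greet, so the result can be inspected; the key point is that y is not used again after x is modified.

```rust
/// Hypothetical helper: reads through an immutable reference first,
/// then modifies the string once that reference is no longer used.
fn greet() -> (usize, String) {
    let mut x = String::from("Hello");

    let y = &x;
    let length_before = y.len(); // last use of `y`: the borrow ends here

    // No immutable reference is still in use, so a mutable borrow is allowed.
    x += ", world";

    (length_before, x)
}

fn main() {
    let (length_before, greeting) = greet();
    println!("{} characters before, now: {}", length_before, greeting);
}
```

Adding another use of y after the += line would reintroduce the E0502 error.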

So how can we work with mutable and immutable references? Consider the following example:

fn main() {

    let mut x:i32 = 42;

    let y: &i32 = &x;
    
    println!("y = {}", y);

    let mut z: i32 = x;
    z += 1;

    println!("x = {}", x);
    println!("y = {}", y);
    println!("z = {}", z);
    

    {
        let a = y;
        let another = &x;
        println!("a = {}", a);
        println!("another = {}", another);
    }

    x += 1;
    println!("x = {}", x);

    {
        let b  = &mut x;
        *b +=  10;
        println!("b = {}", b);

    }

    println!("x = {}", x);
    let last = &mut x;
    *last -= 100;
    println!("last = {}", last);
}

In this code snippet, we perform several operations with mutable and immutable references to showcase Rust’s ownership and borrowing principles.

  • Line 3 initializes a mutable i32 assigned to variable x.
  • Line 5 creates an immutable reference y to the value of x.
  • Line 9 creates a new variable z from x. This works even though the immutable reference y is still in use, because reading x does not modify it. As i32 can be copied, z receives a copy of the value of x, and x keeps its own value.

On lines 17-22, a new scope is created. Here, we copy the reference y into a (shared references can be copied freely, so y remains usable) and establish another immutable reference, another, to x. Remembering the three ownership rules (“When the owner goes out of scope, the value will be dropped”), when the scope ends at line 22, a and another are dropped. None of the immutable references (y, a, another) is used after this point, so the compiler treats the immutable borrows of x as finished, which is why x can be modified on line 24. Any attempt to use y after that modification would result in an error.

On lines 27-32, a new scope introduces a mutable reference b to x. At this point, there are 0 immutable references and 1 mutable reference. Modification of the value behind x is possible by “dereferencing” b, illustrated in line 29 (*b += 10;), which adds 10 to the actual value b is referencing. When this scope ends at line 32, b is dropped, leaving 0 immutable references and 0 mutable references.

Finally, lines 35 and 36 create a mutable reference to x and modify its value.

Throughout this example, x remains the sole owner of the value, never relinquishing ownership. Borrowing the value (y, a, another, b, last) occurs at multiple stages, but x retains ownership. Although y initially held an immutable reference to x, which would have prevented b and last from taking a mutable reference, y and its copy a were no longer used by the time x was modified, so those immutable borrows had already ended. At every point in this code, there were either multiple immutable references or a single mutable reference in use, never both.
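The dereferencing step can also be sketched with a function that receives a mutable reference. The function name add_ten is made up for this illustration; it plays the same role as b in the example above.

```rust
/// Hypothetical helper: modifies the caller's value through a mutable
/// reference by dereferencing it with `*`.
fn add_ten(value: &mut i32) {
    *value += 10;
}

fn main() {
    let mut x: i32 = 43;

    // The mutable borrow only lasts for the duration of the call.
    add_ten(&mut x);

    // `x` still owns the value and can be read again once the borrow ends.
    println!("x = {}", x);
}
```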

Ownership and borrowing are widely regarded as among the most challenging concepts in Rust. Proficiency in these concepts is crucial for mastering Rust.

Summary

So far in this tutorial we’ve looked at the basics of how to write a Rust program. We’ve looked at how variables are defined and how types are important in Rust. We’ve looked at variable mutability, how we change values through our code. We’ve looked at borrowing and shown that we can have as many immutable borrows as we want or a single mutable borrow.

In the next sections we’ll start to build some functionality by looking at functional programming and flow control in Rust.

Did you find this helpful? Be sure to leave a comment below!