Mor Shonrai

A blog covering everything from programming, data science, astronomy and anything that pops into my head.

Rust Tutorial Part II

In this post we’ll dig deeper into Rust programming, focusing on functional programming and controlling flow. For this we’ll introduce concepts such as functions, closures, if statements, match statements, for and while loops.

Functional Programming

In Rust, functional programming can be achieved through two primary methods: using functions defined with the fn keyword or leveraging closures.

Functions, declared using the fn keyword, represent a fundamental approach to functional programming in Rust. They encapsulate blocks of code that can be called multiple times with different arguments.

Closures, on the other hand, are more powerful and flexible. They are similar to functions but can capture variables from their surrounding environment. Closures allow for defining anonymous functions on the fly, making them highly adaptable for tasks requiring flexibility in behavior and data encapsulation.

Both functions and closures play integral roles in enabling functional programming paradigms within Rust, offering different levels of flexibility and usability in various scenarios.

Functions in Rust

Functions in Rust are defined using the following syntax:

fn add_numbers(a: i32, b: i32) -> i32{
    return a + b;
}

fn multiply_numbers(a :i32, b :i32) -> i32{
    a * b
}

fn print_numbers(a :i32, b :i32) {
    println!("{} + {} = {}", a,b, a+b);
}

fn print_numbers_multiply(a :i32, b :i32) -> () {
    println!("{} * {} = {}", a,b, a*b);
}

fn main(){
    let x = 3;
    let y = 4;

    let sum = add_numbers(x,y);
    print_numbers(x,y);
    let product = multiply_numbers(x,y);
    print_numbers_multiply(x,y);

    println!("sum = {}, product = {}", sum, product);

}

In the example above, four functions (plus main) are defined using the fn keyword. When defining functions, specifying the data types of the arguments is necessary, as demonstrated here by using i32 in all cases. Additionally, if a function returns a value, the return type must be declared explicitly. Lines 1 and 5 define the return type as i32, denoted by -> T, where T represents the data type.

Lines 9 and 13 introduce functions that do not return any value. When a function doesn’t return anything, the -> can be omitted. Alternatively, it’s possible to explicitly state the absence of a return value using -> (). Both forms mean the same thing: a function without -> implicitly returns the unit type (). Idiomatic Rust usually omits it.

The functions add_numbers and multiply_numbers both return an i32. However, only add_numbers uses the return keyword. In Rust, if the final expression in a function body isn’t followed by a ;, its value becomes the function’s return value. In the case of multiply_numbers, the absence of ; specifies that the function should return a * b.
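To make the expression rule a little more concrete, here is a minimal sketch. The function names square and describe are made up for this illustration and are not part of the example above.

```rust
// Hypothetical helper names, used only to illustrate the rule.

// The final expression has no trailing `;`, so its value is returned.
fn square(a: i32) -> i32 {
    a * a
}

// `return` is still handy for leaving a function early.
fn describe(a: i32) -> &'static str {
    if a < 0 {
        return "negative";
    }
    // Blocks are expressions too: this `if`/`else` yields the return value.
    if a == 0 { "zero" } else { "positive" }
}

fn main() {
    println!("{} {}", square(4), describe(-2));
}
```

Adding a ; after a * a would turn the expression into a statement, so the function body would evaluate to () and the compiler would report a mismatched types error.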

It’s important to note that function arguments are passed by value, so each function takes ownership of a and b, and both are dropped when the function’s scope ends. This isn’t a problem for i32, which implements the Copy trait: the function receives a copy, and the caller’s x and y remain usable. For types that don’t implement Copy, however, passing a value moves it into the function and the caller can no longer use it.

Consider the following example:

fn print_string( msg : String) -> (){
    println!("{}", msg);
}


fn main(){
    let my_string = String::from("Save Ferris!");
    print_string(my_string);
    println!("{}", my_string);
}

This will give the following error:

error[E0382]: borrow of moved value: `my_string`
 --> src/main.rs:9:20
  |
7 |     let my_string = String::from("Save Ferris!");
  |         --------- move occurs because `my_string` has type `String`, which does not implement the `Copy` trait
8 |     print_string(my_string);
  |                  --------- value moved here
9 |     println!("{}", my_string);
  |                    ^^^^^^^^^ value borrowed here after move
  |

Unlike i32, String does not implement the Copy trait: it owns a heap-allocated buffer of variable length, so Rust never copies it implicitly. Therefore, when print_string receives my_string, it takes ownership. To address this, we have two solutions: either use the clone method when passing my_string to print_string, or modify print_string to borrow the string by taking a reference instead. The corrected code would appear as follows:

fn print_string( msg : String) -> (){
    println!("{}", msg);
}


fn print_string_borrow( msg : &String) -> (){
    println!("{}", msg);
}


fn main(){
    let my_string = String::from("Save Ferris!");
    print_string(my_string.clone());
    print_string_borrow(&my_string);
    println!("{}", my_string);
}
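A common variation, which the example above does not use, is to borrow the text as &str rather than &String. The sketch below assumes a made-up helper called with_length; it accepts both string literals and borrowed Strings, because Rust automatically converts a &String into a &str.

```rust
// `with_length` is a hypothetical helper; it borrows its argument as `&str`.
fn with_length(msg: &str) -> String {
    format!("{} ({} bytes)", msg, msg.len())
}

fn main() {
    let owned = String::from("Save Ferris!");

    // `&owned` is a `&String`, which Rust coerces to `&str` automatically.
    println!("{}", with_length(&owned));

    // String literals are already `&str`, so no allocation is needed.
    println!("{}", with_length("Hello"));

    // `owned` was only borrowed, so it is still usable here.
    println!("{}", owned);
}
```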

Functions can also return tuples. Consider the following:

fn get_powers( a: i32 ) -> (i32, f32){
    (a.pow(2), (a as f32).powf(0.33))
}

fn main(){
    let x :i32 = 8;
    let tup = get_powers(x);
    // Deconstruct tuple
    let (y, z) : (i32, f32) = get_powers(x);

    println!("{}, {}", tup.0, tup.1 );
    println!("{}, {}", y, z );
}

In the example above, a tuple of type (i32, f32) is returned. Line 7 stores the tuple as a variable, while on line 9 the tuple is explicitly destructured, assigning its elements to the variables y and z. Elements of the tuple can be accessed using tup.n, where n is the zero-based index of the element (tup.0 is the first element).

Closures

Closures in Rust bear similarities to lambda functions found in other programming languages. They offer a concise means to create short blocks of functionality within code. Unlike functions defined with fn, closures can capture and manipulate variables from their enclosing scope. They are defined using the |argument| { body } syntax, where argument represents the parameters and body the functionality of the closure. The braces can be omitted when the body is a single expression.

An example of a closure definition:

fn main(){
    let pi  = 3.14_f32;

    let area = |x| pi * x*x;
    
    let print_area = |x| {
        println!("Area of circle with radius {} is {}", x, area(x));
    };

    println!("The area is: {}", area(2.));
    print_area(1.5);
}

In lines 4 and 6, two closures are defined. The area closure accepts a variable x and computes the area of a circle with radius x. This closure borrows the value of pi for the duration of its scope. On the other hand, the print_area closure accepts a variable x, prints a statement, and then passes a copy of x to the area closure.
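Since closures and functions are both callable, either can be handed to another function. The following is a minimal sketch of that idea; apply_twice and double are hypothetical names chosen for this illustration. It also shows a closure that mutates a captured variable, which requires the closure binding to be declared mut.

```rust
// `apply_twice` is a hypothetical helper that accepts any closure or
// function taking an f32 and returning an f32.
fn apply_twice<F: Fn(f32) -> f32>(f: F, x: f32) -> f32 {
    f(f(x))
}

// A plain function with the same signature.
fn double(x: f32) -> f32 {
    x * 2.0
}

fn main() {
    let offset = 1.5_f32;
    // This closure borrows `offset` from the enclosing scope.
    let shift = |x: f32| x + offset;

    println!("{}", apply_twice(shift, 1.0)); // passing a closure
    println!("{}", apply_twice(double, 1.0)); // passing a function

    // A closure that mutates captured state must itself be `mut`.
    let mut calls = 0;
    let mut count = || {
        calls += 1;
        calls
    };
    count();
    count();
    println!("count was called {} times", calls);
}
```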

Flow Control

Flow control is how we direct the flow of the code to different branches depending on some set of circumstances. For example, we may have code that handles customer purchases. We might want to display a special price if that customer is a member of a loyalty program, or if (else if) it is a special date corresponding to a sale on that product, otherwise (else) we display the default price. In this section we’ll look at using if and match statements.

If Statements

Rust’s if statements use the following syntax:

let a :i32 = 4;

if a > 3{
    println!("a is greater than 3");
} else if a < 3{
    println!("a is less than 3");
} else{
    println!("a is equal to 3");
}

Note that an if block must start with an if statement and may have only one if branch and at most one else branch. However, multiple else if branches can be included as needed.

In Rust, if is an expression, so an if block can also produce a value that is assigned to a variable or returned. Every branch must evaluate to the same type. Let’s consider the following example:

let a :i32 = 4;

let my_string :String = if a > 3{
    "a is greater than 3".to_string()
} else if a < 3{
    "a is less than 3".to_string()
} else{
    "a is equal to 3".to_string()
};

Line 3 defines an immutable string my_string, assigned the value from this if block. In lines 4, 6, and 8, the absence of ; at the end of these lines makes each branch evaluate to a String. Finally, line 9 concludes the assignment by adding a ; at the end of the final block.

Match

match statements in Rust are akin to switch statements found in other programming languages. They enable pattern matching on variables, allowing for concise and comprehensive conditional branching.

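A match begins with the keyword match followed by the value to match on, which can be a variable or any other expression. The different options are enclosed within a set of {}. Each option is a pattern, such as a literal, a range, or the catch-all _, and the code to run for that pattern is assigned using the => syntax. A condition such as x > 10 cannot be used as a pattern directly, but it can be attached to a pattern as a guard (for example, x if x > 10 => ...). The patterns must cover every possible value.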

fn main(){

    let a :i32 = 4;

    match a {
        0..=2 => {
            println!("a is less than 3");
        },
        4..=10 => {
            println!("a is greater than 3");
        },
        3 => {
            println!("a is 3");
        },
        _ => {
            println!("a is negative or > 10");
        },
    }

    let b = match a {
        0 => "0",
        1 => "alpha",
        2 => "2",
        3 => "delta",
        4 => "for",
        _ => "Something else",
    };

    println!("b is {}", b);
}

In lines 6 and 9, the code checks whether a falls within the ranges 0-2 (< 3) and 4-10, respectively. On line 12, it checks if a equals 3. Finally, on line 15, the default case is defined using _, which catches every value not matched by an earlier pattern. Each branch in this match statement executes a block of code enclosed within its scope.

In the example on lines 20-27, the match is used as an expression: each branch evaluates to a string slice (&str), and the result is assigned to b.
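When a branch depends on a condition rather than a fixed value, a match guard can be attached to a pattern. Below is a small sketch of guards, alternative patterns and ranges working together. The function name classify is made up for this illustration.

```rust
// `classify` is a hypothetical helper name.
fn classify(x: i32) -> &'static str {
    match x {
        // A guard (`if ...`) adds a condition to a pattern.
        n if n < 0 => "negative",
        0 => "zero",
        // `|` matches any one of several patterns.
        1 | 2 | 3 => "small",
        // An inclusive range pattern.
        4..=10 => "medium",
        // `_` catches everything that is left.
        _ => "large",
    }
}

fn main() {
    for x in [-5, 0, 2, 7, 42] {
        println!("{} is {}", x, classify(x));
    }
}
```

Branches are tested from top to bottom and the first one that matches wins, so the order of the patterns matters when they overlap.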

Enums

enums are a data type with a fixed number of possible values. They have a lot of uses, but it is often useful to pair an enum with a match statement. Consider the following:

enum Status{
    On,
    Off,
    Standby,
}

fn power_cycle(current: &Status) -> Status{
    match current{
        Status::On => Status::Off,
        Status::Off => Status::On,
        Status::Standby => Status::Off,
    }
}

fn main(){

    let mut power_status = Status::Standby;

    for _ in 0..10{

        match power_status {
            Status::Standby => {
                println!("Machine on standby...Cycling Power");
            },
            Status::On => {
                println!("Machine on...Cycling Power");
            },
            Status::Off => {
                println!("Machine off...Cycling Power");
            }
        }

        power_status = power_cycle(&power_status);
    }    
}

In the above example, we defined an enum called Status with 3 options (On, Off and Standby). We can imagine this being some device we want to interface with. On lines 7-13 we defined a function power_cycle which takes in the current Status and returns a new Status. We could imagine this being a function to cycle the power on the device we’re interfacing with. On line 21 in the main function, we are match-ing on the enum. On line 33 we’re calling the power_cycle function to switch what the power status is. We’re using the fixed number of options available to control the flow of the program. This gives the following output:

Machine on standby...Cycling Power
Machine off...Cycling Power
Machine on...Cycling Power
Machine off...Cycling Power
Machine on...Cycling Power
Machine off...Cycling Power
Machine on...Cycling Power
Machine off...Cycling Power
Machine on...Cycling Power
Machine off...Cycling Power

Using an enum is a great option when dealing with a fixed set of possible outcomes or options. For example, we could have an enum of colors when trying to make a plot. The function to perform the plotting could take in the color enum then use a match statement to set the color of the points.
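The plotting idea could be sketched roughly as follows. The Color enum and the hex_code function are hypothetical and there is no real plotting library involved; the derive line is only there so that the values can be printed and copied (traits are covered in part 3).

```rust
// Hypothetical `Color` enum and `hex_code` helper, for illustration only.
#[derive(Debug, Clone, Copy)]
enum Color {
    Red,
    Green,
    Blue,
}

// Map each variant to a hex string that a plotting routine could use.
fn hex_code(color: Color) -> &'static str {
    match color {
        Color::Red => "#ff0000",
        Color::Green => "#00ff00",
        Color::Blue => "#0000ff",
    }
}

fn main() {
    for color in [Color::Red, Color::Green, Color::Blue] {
        println!("{:?} -> {}", color, hex_code(color));
    }
}
```

Because a match must cover every variant, adding a fourth color to the enum would cause a compile error in hex_code until the new case is handled.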

Loops

Loops in Rust are straightforward and flexible. The loop keyword initiates an infinite loop, allowing code to execute repeatedly within a defined scope until explicitly interrupted by a break statement.

For instance:

fn main(){
    let mut i = 0;

    loop {
        i+=1;
        if i == 3{
            continue;
        } else if i > 10{
            break;
        } else{
            println!("i = {}", i);
        }
    }
}

Lines 4-13 make up the loop block, which opens on line 4. At line 7, a continue statement is employed to skip the iteration where i equals 3. Moreover, line 9 utilizes a break statement to exit the loop when the condition i > 10 is met.

In Rust, it is possible to assign labels to loops to facilitate continue or break operations targeting a specific loop. This is achieved using the 'name: loop {} syntax:

fn main(){
    let mut i = 0;

    'astra : loop {
        let mut j = 0;

        'kafka : loop{
            if i > 10{
                break 'astra;
            } else if j > 3{
                break 'kafka;
            } else{
                println!("i,j = {},{}", i,j);
            }
            j+=1;
        }
        i+=1;
    }
}

In the provided example, we establish a parent loop named 'astra, encompassing the scope from line 4 to line 18. Within 'astra, we define a nested loop named 'kafka, spanning lines 7 to 16.

At line 9, a break 'astra statement exits the outer 'astra loop once i > 10 (the condition on line 8). Similarly, line 11 uses break 'kafka to exit only the inner 'kafka loop once j > 3.

The output of this code will be:

i,j = 0,0
i,j = 0,1
i,j = 0,2
i,j = 0,3
i,j = 1,0
...
i,j = 10,2
i,j = 10,3
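Rust also has a while loop, which keeps running as long as a condition holds and so saves us from writing the if ... break by hand. Here is a minimal sketch; count_halvings is a made-up helper. The second half shows that a loop can hand a value back through break.

```rust
// `count_halvings` is a hypothetical helper: how many times can `n` be
// halved (integer division) before it reaches zero?
fn count_halvings(mut n: u32) -> u32 {
    let mut steps = 0;
    // The condition is checked before every iteration.
    while n > 0 {
        n /= 2;
        steps += 1;
    }
    steps
}

fn main() {
    println!("10 can be halved {} times", count_halvings(10));

    // `loop` is an expression: `break value` makes the loop evaluate to it.
    let mut i = 0;
    let first_square_over_50 = loop {
        i += 1;
        if i * i > 50 {
            break i * i;
        }
    };
    println!("first square over 50 = {}", first_square_over_50);
}
```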

For Loops

For loops in Rust operate on anything that can be turned into an iterator, that is, any type implementing the IntoIterator trait. This includes constructs such as for element in list or for i in a range. The syntax used for these loops is as follows:

fn main(){
    let n:i32 = 10;

    for i in 0..n{
        println!("i = {}", i);
    }
}

In this context, we define a range 0..n, which is half-open: it includes 0 but excludes n, so with n = 10 the loop runs from 0 to 9. Alternatively, we could use the inclusive range 0..=9.
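The difference between the two range forms is easy to check by collecting them into vectors. This is a small sketch in which exclusive and inclusive are hypothetical helper names.

```rust
// Hypothetical helpers that collect each kind of range into a vector.
fn exclusive(n: i32) -> Vec<i32> {
    (0..n).collect() // 0 up to, but not including, n
}

fn inclusive(n: i32) -> Vec<i32> {
    (0..=n).collect() // 0 up to and including n
}

fn main() {
    println!("{:?}", exclusive(5));
    println!("{:?}", inclusive(5));

    // Ranges are iterators, so adaptors such as `rev` and `step_by` work too.
    for i in (0..10).rev().step_by(3) {
        println!("i = {}", i);
    }
}
```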

When dealing with an array or vector of items, we can iterate over them as follows:

fn main(){
    let my_arr: [f32;5] = [1.,2.,3.,43., 3.14];
    for a in my_arr{
        println!("{}",a);
    }
    println!("{:?}", my_arr);
}

In the above example, a stores a copy of the values from my_arr rather than a reference to those values. Modifying a will not alter my_arr. However, the behavior slightly differs when working with vectors.

fn main(){
    let my_arr: Vec<f32> = vec![1.,2.,3.,43., 3.14];
    for a in my_arr{
        println!("{}",a);
    }

    println!("{:?}", my_arr);
}

The above example will return an error on line 7.


error[E0382]: borrow of moved value: `my_arr`
   --> src/main.rs:7:22
    |
2   |     let my_arr: Vec<f32> = vec![1.,2.,3.,43., 3.14];
    |         ------ move occurs because `my_arr` has type `Vec<f32>`, which does not implement the `Copy` trait
3   |     for a in my_arr{
    |              ------ `my_arr` moved due to this implicit call to `.into_iter()`
...
7   |     println!("{:?}", my_arr);
    |                      ^^^^^^ value borrowed here after move
    |

The error indicates that Vec<f32> doesn’t implement the Copy trait (more on traits in part 3!). Consequently, the for loop cannot work on a copy of the vector. Instead, it implicitly calls into_iter() at line 3, which moves my_arr into the loop. The loop consumes the vector and drops it when it finishes, so my_arr can no longer be used on line 7.

Looking further at the compile output we see:

help: consider iterating over a slice of the `Vec<f32>`'s content to avoid moving into the `for` loop
    |
3   |     for a in &my_arr{
    |              +

For more information about this error, try `rustc --explain E0382`.

Here we see some of the awesome features of the Rust compiler. It is smart enough to understand what we are trying to do and suggest a fix to the code. The fixed code would look like:

fn main(){
    let my_arr: Vec<f32> = vec![1.,2.,3.,43., 3.14];
    for a in &my_arr{
        println!("{}",a);
    }

    println!("{:?}", my_arr);
}

At line 3, we’re iterating over a reference to the vector instead of the vector itself. Each a is now a reference (&f32) to an element, so the vector is only borrowed for the duration of the loop and can still be used afterwards.

We can also iterate over several collections at once by zipping them together, which yields tuples of their values:

fn main(){
    let x: Vec<f32> = vec![1.,2.,3.,43., 3.14];
    let y: Vec<f32> = vec![2.,0.1,5.3,0.001, 3.14];
    let mut z: Vec<f32> = vec![0.0_f32; 5];

    for ((a, b), i) in x.iter().zip(&y).zip(0..x.len()){
        println!("{},{}",a, b);
        z[i] = a + b;
    }

    println!("{:?}", z);
}
vector for loop

In the provided example, there are three vectors: x, y, and z, where z is a mutable vector.

At line 6, iter is used to obtain an iterator over references to the elements of x. It is then zip-ped with a reference to y, which implicitly invokes into_iter on &y (similar to the vector for loop example). This produces tuples of type (&f32, &f32). Another zip is then performed with the range 0..x.len(), effectively creating a loop over tuples of type ((&f32, &f32), usize).

Within this loop, values are assigned to z.

Alternatively, we could have used enumerate() instead of zip(0..x.len()):

fn main(){
    let x: Vec<f32> = vec![1.,2.,3.,43., 3.14];
    let y: Vec<f32> = vec![2.,0.1,5.3,0.001, 3.14];
    let mut z: Vec<f32> = vec![0.0_f32; 5];

    for (i, (a, b)) in x.iter().zip(&y).enumerate(){
        println!("{},{}",a, b);
        z[i] = a + b;
    }

    println!("{:?}", z);
}
Assigning Values in a Loop

Here we are forming the tuple of elements from x and y, with x.iter().zip(&y) producing an iterator of (&f32, &f32). Calling enumerate() on that iterator gives us a new tuple of (usize, (&f32, &f32)), where the usize is the index of the current iteration. enumerate is useful when we want to loop over some iterator and also use the current index. In the above example we used the current index to fill the vector z.

Aside on iter vs into_iter:

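In the vector for loop example, we employed the for a in my_arr syntax, which implicitly calls the into_iter method. What into_iter yields depends on what it is called on: called on a Vec<T> it consumes the vector and yields the values themselves (T), called on a &Vec<T> it yields references (&T), and called on a &mut Vec<T> it yields mutable references (&mut T). On the other hand, the iter method always yields references (&T) and leaves the collection intact.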

If distinguishing between the two seems perplexing, consider into_iter as moving the value “into” the scope. If ownership needs to be maintained, it’s advisable to use iter. Conversely, if the value can be consumed by the scope, into_iter is preferable.

For a more detailed explanation, refer to this Stack Overflow question.
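As a rough sketch of the difference, the three helpers below (total, scale and doubled, all hypothetical names) use iter, iter_mut and into_iter respectively. The comments note what kind of item each loop receives.

```rust
// `total`, `scale` and `doubled` are hypothetical helpers.

// `iter` borrows: items are `&f32` and the data stays usable afterwards.
fn total(values: &[f32]) -> f32 {
    let mut sum = 0.0;
    for v in values.iter() {
        sum += v;
    }
    sum
}

// `iter_mut` borrows mutably: items are `&mut f32`, so we can edit in place.
fn scale(values: &mut [f32], factor: f32) {
    for v in values.iter_mut() {
        *v *= factor;
    }
}

// `into_iter` on an owned Vec consumes it: items are plain `f32`.
fn doubled(values: Vec<f32>) -> Vec<f32> {
    values.into_iter().map(|v| v * 2.0).collect()
}

fn main() {
    let mut data = vec![1.0_f32, 2.0, 3.0];

    println!("total = {}", total(&data));

    scale(&mut data, 10.0);
    println!("scaled = {:?}", data);

    let out = doubled(data); // `data` is moved here and cannot be used again
    println!("doubled = {:?}", out);
}
```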

Looping the Rust way

In the Assigning Values in a Loop example, we explored how to derive values from two vectors to assign to a third vector. However, this approach isn’t considered very idiomatic in Rust. A more idiomatic way to achieve this would be:

fn main(){
    let x: Vec<f32> = vec![1.,2.,3.,43., 3.14];
    let y: Vec<f32> = vec![2.,0.1,5.3,0.001, 3.14];

    let z  = x.iter().zip(&y).map(|(a,b)| a + b).collect::<Vec<f32>>();
    println!("{:?}", z);
}
Idiomatic Rust For Loop

In this example, we condense the entire loop into a single line of code. Starting with x.iter(), we iterate over references to the values within x. Using the zip function with a reference to y facilitates the iteration over a tuple of type (&f32, &f32).

Each tuple undergoes processing within a closure passed to the map method. This closure deconstructs the tuple into two values and adds them together. The collect() method accumulates the values returned by the closure used in the map method.

The “Turbofish” syntax, collect::<type>(), informs collect about the desired return type. In this instance, using collect::<Vec<f32>>(), we obtain a Vec<f32>.
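The turbofish is not the only way to tell collect what to build. As a sketch, the hypothetical helper add_pairs below relies on its return type instead, while main shows a type annotation on the binding and a turbofish that uses _ to leave the element type to the compiler.

```rust
// `add_pairs` is a hypothetical helper.
fn add_pairs(x: &[f32], y: &[f32]) -> Vec<f32> {
    // The function's return type tells `collect` what to build,
    // so no turbofish is needed here.
    x.iter().zip(y).map(|(a, b)| a + b).collect()
}

fn main() {
    let x = vec![1.0_f32, 2.0, 3.0];
    let y = vec![0.5_f32, 0.5, 0.5];

    // Option 1: annotate the binding.
    let z: Vec<f32> = add_pairs(&x, &y);

    // Option 2: turbofish, with `_` letting the compiler infer the element type.
    let doubled = z.iter().map(|v| v * 2.0).collect::<Vec<_>>();

    println!("{:?} {:?}", z, doubled);
}
```

Note that zip stops as soon as either input runs out, so vectors of different lengths produce a result as long as the shorter one.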

Chaining iterator methods, as shown in the Idiomatic Rust For Loop example, is a powerful tool. For instance, suppose we want to square all the even values in a vector and sum all the odd values. We could employ:

fn main() {
    let values = (0..100).collect::<Vec<i32>>();

    let even_squared = values.iter()
        .filter(|&x| x % 2 == 0)
        .map(|x| x * x)
        .collect::<Vec<i32>>();


    let odd_sum = values.iter()
        .filter(|&x| x % 2 == 1)
        .sum::<i32>();

    println!("{:?}", even_squared);
    println!("Sum of the odd values = {}", odd_sum);
}

In the given code, values is a range from 0 to 99 inclusive, which we have collect-ed as a vector of integers (i32). The operations on this range illustrate various methods provided by Rust’s Iterator trait.

Starting from line 4, even_squared is created by iterating over values. The iter method is used here to avoid consuming the original vector, which lets us iterate over values separately on lines 4 and 10. The filter method is then applied to this iterator of references, using a closure (|&x| x % 2 == 0) to test if each element is even by performing a modulo operation and checking if the remainder is zero. Elements satisfying this condition are retained, while those failing the test are discarded. The subsequent map method takes the retained even numbers, squares each value by multiplying it by itself (x * x), and produces a transformed iterator. Finally, the collect method is used to gather the squared even numbers into a Vec<i32>.

Next, between lines 10 and 12, odd_sum is calculated using a similar approach. Here, the filter method is again used on values.iter() (which yields immutable references to the elements in values), but this time with a closure (|&x| x % 2 == 1) that filters for odd numbers. The sum method is applied to this filtered iterator to compute the sum of the odd numbers present in the range.

The code concludes by displaying the vector containing squared even numbers (even_squared) and printing the sum of the odd numbers (odd_sum).
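The sum method is a ready-made reduction. For reductions that don’t have a dedicated method, fold can be used to carry an accumulator through the iterator. The following sketch combines the filtering and squaring from above into a single value; sum_even_squares is a hypothetical helper name.

```rust
// `sum_even_squares` is a hypothetical helper: it keeps the even values,
// squares them, and folds them into a single running total.
fn sum_even_squares(values: &[i32]) -> i32 {
    values
        .iter()
        .filter(|&&x| x % 2 == 0)
        // `fold` starts from 0 and updates the accumulator for each item.
        .fold(0, |acc, &x| acc + x * x)
}

fn main() {
    let values = (0..100).collect::<Vec<i32>>();
    println!("Sum of even squares = {}", sum_even_squares(&values));
}
```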

Summary

In this tutorial we learned what we need to start writing more complicated Rust code. We can now utilize various forms of flow control such as if and match statements, and infinite loops and for loops. We learned how to write functions in Rust either using the fn keyword or as quicker closures.

We also looked at how to write functions and loops in a more Rustic way, showcasing the power of Rust in writing pipelines in a human-readable way.

In the next tutorial we’ll dive into structs, generic types and traits. This will transition us towards object-oriented programming. Utilizing generic types and traits, we’ll see how Rust approaches polymorphism in what I believe to be one of Rust’s most powerful features!