Working with complex input and output can make command line applications challenging to test, as it is inconvenient to capture the output stream to test if the program returns the correct output.
Using abstraction through Rust’s Read
and Write
traits, we can swap the input and output for byte arrays and vectors during testing instead.
Standard Streams
Standard streams are abstractions used to handle data input and output to an operating system process. Each program has access to an input stream (standard input, or stdin), an output stream (standard output, or stdout), and an error stream (standard error, or stderr) inherited from the parent process.
An example of a program that takes input from stdin, processes it and returns output through stdout is grep. The grep utility reads lines from stdin, filters those lines based on the user-supplied search pattern, and finally outputs all lines that match the pattern. When evaluated, the grep utility halts to wait for input through stdin. In this example, we start grep and pass it “th” as the search pattern:
grep th
If we type “one”, and press enter a new line gets passed to grep through stdin. Grep takes the line, and notices that it doesn’t match the search pattern, so it does nothing:
grep th one
Now, if we send a line that does match the search pattern, like “three” or “fourth”, grep will print it back through stdout. The result is slightly confusing, as stdin and stdout are mixed in the terminal, but here, the first “three” is typed manually and the second is returned by grep:
grep th one three
three
Then, like before, the program returns to waiting for input until it receives an EOF (end-of-file), which we pass by pressing ctrl+D
in the terminal.
Pipelines
Because of this abstraction, programs can use pipelines to pass the output from one program as the input to another by piping stdout from one process to stdin for another.
Here, ls prints the current directory’s contents to stdout. This example uses a pipe character to create a pipeline, to pass the output from ls as input to grep. Grep then filters to only print lines matching the passed pattern (“Cargo”).
ls | grep Cargo
Cargo.lock Cargo.toml
Stdin, Stdout and Stderr in Rust
Rust provides handles to the standard streams through the Stdin
, Stdout
and Stderr
structs, which are created with the io::stdin()
, io::stdout()
and io::stderr()
functions respectively.
This program takes input through stdin, converts the received string to uppercase and prints it back out to the terminal through stdout:
use std::io; use std::io::{Read, Write}; fn main() -> io::Result<()> { let mut buffer = "".to_string(); io::stdin().read_to_string(&mut buffer)?; io::stdout().write_all(buffer.to_uppercase().as_bytes())?; Ok(()) }
The stream handlers implement the Read
and Write
traits to read from and write to the streams.
Because of that, they share part of their implementation with other Readers and Writers, like File
.
To test the program, we can pipe the output of ls | grep Cargo
to it, which will print the file names in uppercase:
ls | grep Cargo | cargo run
CARGO.LOCK CARGO.TOML
Abstraction using the Read and Write traits
One of the issues1 in the example above is that it uses the Stdout
and Stdin
structs directly, making our program challenging to test because it is inconvenient to pass input through stdin and capture stdout to assert that the program produces the correct results.
To make our program more modular, we will decouple it from the Stdin
and Stdout
structs and pass the input and output as arguments to a more abstract, separate function.
In the test for the extracted function, we swap Stdin
and Stdout
out for other implementors of the Read
and Write
traits: a byte array for input and a vector for output.
#[cfg(test)] mod tests { use super::*; #[test] fn writes_upcased_input_to_output() { let mut output: Vec<u8> = Vec::new(); upcase(&mut "Hello, world!\n".as_bytes(), &mut output).unwrap(); assert_eq!(&output, b"HELLO, WORLD!\n"); } }
The implementation that satisfies the test looks like the original example, with one significant difference.
Because the test passes the input and output as arguments, we can use trait objects to allow any type as long as it implements the Read
and Write
traits:
use std::io::{Error, Read, Write}; pub fn upcase( input: &mut impl Read, output: &mut impl Write, ) -> Result<(), Error> { let mut buffer = "".to_string(); input.read_to_string(&mut buffer)?; output.write_all(buffer.to_uppercase().as_bytes())?; Ok(()) }
Finally, we replace the prototype in src/main.rs
with a call to our new implementation with a Stdin
and Stdout
struct for the input and output:
use std::io; fn main() -> io::Result<()> { upcase::upcase(&mut io::stdin(), &mut io::stdout()) }
By abstracting Stdin
and Stdout
out of the implementation, we made our program more modular, allowing us to test the code without resorting to capturing stdout to assert that the printed result matched our expectations.
Aside from better testability, making our implementation more modular will allow us to work with other data types in the future.
For example, we might add a command-line option that takes a filename and pass a File
to upcase()
.
Since File
also implements the Read
trait, that would work without further modifications in our implementation.
Another issue with this example is that it uses
↩︎Read::read_to_string()
, which will read the contents of the whole stream from the input before writing everything to stdout at once, which is inefficient, especially for larger inputs. A more efficient implementation could use buffered reading through theBufRead
trait to read and write the input stream line by line.