Type Conversion between Tensor and Other Rust Types

Tensor and Rust Type Conversions

Part of this material has already been covered in the previous two sections (Tensor Creation; Tensor Deconstruction and Ownership). In particular, Tensor Structure and Ownership explains RSTSR's features through code examples.

When using tensor libraries, users often need to interact with other Rust types (including Vec<T>, &[T], or other linear algebra libraries like Faer). This section systematically explains how to implement conversions between RSTSR tensors and other Rust types from a practical usage perspective.

This section applies only to CPU backends. No other backend types are implemented yet, and this section may not carry over to them in future versions of RSTSR.

1. Conversions with `Vec<T>`

1.1 From `Vec<T>`: asarray

RSTSR's tensor Tensor<T, B, D> can be created from raw vector data, dimensions, and device information using the rt::asarray function. The rt::asarray function has multiple overloads, which will be detailed in future API documentation.

The following program stores raw data as a tensor of shape (2, 3) on a 16-core parallel OpenBLAS device. Note that the result depends on whether RSTSR uses a row-major or a column-major layout:

let device = DeviceOpenBLAS::new(16);
let vec = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let tensor = rt::asarray((vec, [2, 3], &device));
println!("{tensor:8.4}");
// output (row-major):
// [[   1.0000   2.0000   3.0000]
//  [   4.0000   5.0000   6.0000]]
// output (col-major):
// [[   1.0000   3.0000   5.0000]
//  [   2.0000   4.0000   6.0000]]
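
The difference between the two outputs comes only from how the same flat buffer is indexed. The following std-only sketch illustrates this; `strides_2d` and `rows` are hypothetical helpers written for this example and are not part of RSTSR's API.

```rust
/// Strides (in elements) for a 2-D shape under either memory order.
/// Hypothetical helper for illustration, not RSTSR API.
fn strides_2d(shape: [usize; 2], row_major: bool) -> [usize; 2] {
    if row_major {
        [shape[1], 1] // last axis is contiguous
    } else {
        [1, shape[0]] // first axis is contiguous
    }
}

/// Read a flat buffer through (shape, strides) and return the logical rows.
fn rows(buf: &[f64], shape: [usize; 2], strides: [usize; 2]) -> Vec<Vec<f64>> {
    (0..shape[0])
        .map(|i| {
            (0..shape[1])
                .map(|j| buf[i * strides[0] + j * strides[1]])
                .collect()
        })
        .collect()
}

fn main() {
    let buf = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    let shape = [2, 3];
    // row-major: [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    println!("{:?}", rows(&buf, shape, strides_2d(shape, true)));
    // col-major: [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]]
    println!("{:?}", rows(&buf, shape, strides_2d(shape, false)));
}
```

The buffer itself is never rearranged; only the strides used to interpret it change.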

1.2 From `Vec<T>`: Manual Construction

As mentioned in the previous section, RSTSR tensors have a multi-layered structure. While the rt::asarray function is intuitive, it hides the specific process of constructing RSTSR tensors.

The following program shows how a complete RSTSR tensor is built step by step, starting from the basic Vec<T> data storage unit:

use rstsr_core::prelude_dev::*;

// step 1: wrap vector into data representation
let vec = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let data: DataOwned<Vec<f64>> = DataOwned::from(vec);

// step 2: construct device
let device: DeviceOpenBLAS = DeviceOpenBLAS::new(16);

// step 3: construct storage pinned to the device
let storage: Storage<DataOwned<Vec<f64>>, f64, DeviceOpenBLAS> = Storage::new(data, device);

// step 4: construct layout (row-major case, where the last stride is 1)
// this will give 2-D layout with dynamic shape
// arguments: ([nrow, ncol], [stride_row, stride_col], offset)
let layout = Layout::new(vec![2, 3], vec![3, 1], 0).unwrap();
// if you prefer a static shape, you can use:
// let layout = Layout::new([2, 3], [3, 1], 0).unwrap();

// step 5: construct tensor
let tensor = Tensor::new(storage, layout);

println!("{tensor:8.4}");
// output:
// [[   1.0000   2.0000   3.0000]
//  [   4.0000   5.0000   6.0000]]

1.3 Into `Vec<T>`: into_vec Function

For CPU backends, this function can convert a 1-D tensor into a Vec<T> vector.

Please note the following caveats:

This function cannot convert higher-dimensional tensors (e.g., 2-D) into vectors; the following call is rejected:
```
println!("{:?}", tensor.shape());
// output: [2, 3]
// not allowed: `tensor` is 2-D, but `into_vec` only accepts 1-D tensors
let vec = tensor.into_vec();
println!("{:?}", vec);
```
If you genuinely need to convert higher-dimensional tensors into vectors, you must first use into_shape or into_contig to convert them into 1-D tensors.

For Tensor<T, B, D> (tensors that own their data), this function typically does not copy data, meaning it has almost no runtime cost:

// create tensor
let vec_raw = vec![1, 2, 3, 4, 5, 6];
let ptr_raw = vec_raw.as_ptr();
let tensor = rt::asarray(vec_raw);

// convert tensor to vector
let vec_out = tensor.into_vec();
let ptr_out = vec_out.as_ptr();
println!("{vec_out:?}");

// data is moved and no copy occurs
assert_eq!(ptr_raw, ptr_out);

However, if the offset is non-zero, the stride is not 1, or the underlying data length does not match the dimension information, the data will still be copied:

// create tensor with stride -1
// by flipping the tensor along the 0-th axis
let vec_raw = vec![1, 2, 3, 4, 5, 6];
let ptr_raw = vec_raw.as_ptr();
let tensor = rt::asarray(vec_raw).into_flip(0);
println!("{tensor:?}");
// output: [6, 5, 4, 3, 2, 1]

// convert tensor to vector
let vec_out = tensor.into_vec();
let ptr_out = vec_out.as_ptr();
println!("{vec_out:?}");
// output: [6, 5, 4, 3, 2, 1]

// data is cloned, so this `into_vec` is expensive
assert_ne!(ptr_raw, ptr_out);

For reference types (e.g., TensorView<'_, T, B, D>), this function will copy data.
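
The move-or-copy rule for owning tensors can be sketched with the standard library alone. The type `Owned1D` and its methods below are hypothetical stand-ins written for this illustration; they mimic the behavior described above and are not RSTSR's implementation.

```rust
/// Hypothetical stand-in for a 1-D owning tensor: a raw buffer plus a layout.
/// The names are illustrative and are not RSTSR's API.
struct Owned1D<T> {
    raw: Vec<T>,
    len: usize,
    stride: isize,
    offset: usize,
}

impl<T: Clone> Owned1D<T> {
    fn from_vec(raw: Vec<T>) -> Self {
        let len = raw.len();
        Owned1D { raw, len, stride: 1, offset: 0 }
    }

    /// Reverse the axis: start at the last element and walk backwards.
    fn into_flip(self) -> Self {
        if self.len == 0 {
            return self;
        }
        let last = self.offset as isize + (self.len as isize - 1) * self.stride;
        Owned1D { offset: last as usize, stride: -self.stride, ..self }
    }

    /// Reuse the buffer when it already is the logical vector;
    /// otherwise gather the elements into a new allocation.
    fn into_vec(self) -> Vec<T> {
        if self.offset == 0 && self.stride == 1 && self.len == self.raw.len() {
            return self.raw; // moved, no copy
        }
        (0..self.len as isize)
            .map(|i| self.raw[(self.offset as isize + i * self.stride) as usize].clone())
            .collect()
    }
}

fn main() {
    // plain layout: the original allocation is handed back
    let raw = vec![1, 2, 3, 4, 5, 6];
    let ptr = raw.as_ptr();
    let out = Owned1D::from_vec(raw).into_vec();
    assert_eq!(out.as_ptr(), ptr);

    // flipped layout (stride -1, offset 5): elements are gathered into a new Vec
    let raw = vec![1, 2, 3, 4, 5, 6];
    let ptr = raw.as_ptr();
    let out = Owned1D::from_vec(raw).into_flip().into_vec();
    assert_eq!(out, [6, 5, 4, 3, 2, 1]);
    assert_ne!(out.as_ptr(), ptr);
}
```

Comparing pointers before and after the conversion, as done here and in the examples above, is a cheap way to check whether a particular call copied the data.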

1.4 Into `Vec<T>`: Top-down Deconstruction

This discussion only applies to Tensor<T, B, D> (tensors that own their data).

RSTSR tensors can be constructed from scratch using Vec<T> or deconstructed top-down from Tensor<T, B, D>. Deconstructing a tensor requires calling into_raw_parts twice and into_raw once:

// construct tensor by asarray
let device = DeviceOpenBLAS::new(16);
let vec = vec![1, 2, 3, 4, 5, 6];
let tensor = rt::asarray((vec, [2, 3], &device));

// step 1: tensor -> (storage, layout)
let (storage, layout) = tensor.into_raw_parts();
println!("{layout:?}");

// step 2: storage -> (data, device)
let (data, device) = storage.into_raw_parts();
println!("{device:?}");

// step 3: data -> raw, where DeviceOpenBLAS::Raw = Vec<T>
let vec = data.into_raw();
println!("{vec:?}");
// output: [1, 2, 3, 4, 5, 6]

However, this approach has a caveat: it returns only the underlying vector used for data storage and ignores the tensor's layout. For tensors of any dimension (including higher-dimensional ones such as 2-D), top-down deconstruction can still extract the Vec<T> data, but this data may not match what the into_vec function returns. This can be demonstrated with a tensor whose stride is not 1:

let vec_raw = vec![1, 2, 3, 4, 5, 6];
let ptr_raw = vec_raw.as_ptr();
let tensor = rt::asarray(vec_raw).into_flip(0);

// step 1: tensor -> (storage, layout)
let (storage, layout) = tensor.into_raw_parts();
println!("{layout:?}");
// output:
// 1-Dim (dyn), contiguous: Custom
// shape: [6], stride: [-1], offset: 5

// step 2: storage -> (data, device)
let (data, device) = storage.into_raw_parts();
println!("{device:?}");

// step 3: data -> raw, where DeviceOpenBLAS::Raw = Vec<T>
// this returns the original buffer `vec_raw`, untouched
let vec_out = data.into_raw();
let ptr_out = vec_out.as_ptr();
assert_eq!(ptr_raw, ptr_out);
println!("{vec_out:?}");
// output: [1, 2, 3, 4, 5, 6]

// note that `tensor.into_vec()` would instead give
// output: [6, 5, 4, 3, 2, 1]

Therefore, if you want to obtain Vec<T> through top-down deconstruction, you must ensure that the tensor or vector's layout meets your expectations.

1.5 To `Vec<T>`: to_vec Function

This function is essentially the same as into_vec, including its usage and caveats. The difference is that it does not consume the input tensor and therefore always copies memory.

2. Conversions with `&[T]`/`&mut [T]` or Pointer Types

In Rust, &[T] (or &mut [T]) closely resembles a pointer type: compared with a raw pointer, &[T] additionally carries a length. Therefore, when you have a *const T pointer and a usize length in Rust, the approach is identical to working with &[T].
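
As a minimal sketch of that equivalence, a pointer and a length can be turned back into a slice with `std::slice::from_raw_parts`. The wrapper `slice_from_ptr` is a hypothetical helper for this example; the caller is responsible for the safety conditions listed in its documentation.

```rust
use std::slice;

/// Rebuild a borrowed slice from a pointer and a length.
/// Hypothetical helper for illustration.
///
/// # Safety
/// `ptr` must point to `len` initialized, properly aligned values of `T`
/// that stay valid and are not mutated for the whole lifetime `'a`.
unsafe fn slice_from_ptr<'a, T>(ptr: *const T, len: usize) -> &'a [T] {
    unsafe { slice::from_raw_parts(ptr, len) }
}

fn main() {
    let vec = vec![1, 2, 3, 4, 5, 6];
    let (ptr, len) = (vec.as_ptr(), vec.len());

    // `vec` outlives `slc`, so the safety conditions hold here
    let slc: &[i32] = unsafe { slice_from_ptr(ptr, len) };
    assert_eq!(slc, &vec[..]);
    assert_eq!(slc.as_ptr(), ptr); // same memory, nothing copied
    println!("{slc:?}");
}
```

Once the slice exists, it can be passed on exactly like any other &[T].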

2.1 From `&[T]`: asarray

As with Vec<T>, the rt::asarray function also accepts &[T]. In this case, however, it returns a tensor view TensorView<'_, T, B, D> rather than an owning tensor Tensor<T, B, D>. Note that the result depends on whether RSTSR uses a row-major or a column-major layout:

let device = DeviceOpenBLAS::new(16);
let vec = vec![1, 2, 3, 4, 5, 6];
let tensor = rt::asarray((&vec, [2, 3], &device));
println!("{tensor}");
// output (row-major):
// [[ 1 2 3]
//  [ 4 5 6]]
// output (col-major):
// [[ 1 3 5]
//  [ 2 4 6]]

Similarly, &mut [T] can be used to create a mutable view TensorMut<'_, T, B, D> through a similar process.

2.2 From `&[T]`: Manual Construction

The approach here is consistent with manually constructing from Vec<T>. However, note that RSTSR's CPU backend always processes reference types as &Vec<T> rather than &[T]¹. Therefore, RSTSR requires first constructing a Vec<T> from the &[T]; this vector is never dropped automatically and carries a lifetime annotation. Specifically:

use rstsr_core::prelude_dev::*;
use std::mem::ManuallyDrop;

let vec = vec![1, 2, 3, 4, 5, 6];
let vec_ref: &[usize] = &vec;

// step 1: wrap reference into data representation
// this uses `ManuallyDrop` to avoid double free
let vec_manual_drop: ManuallyDrop<Vec<usize>> = ManuallyDrop::new(unsafe {
    Vec::from_raw_parts(vec_ref.as_ptr() as *mut _, vec_ref.len(), vec_ref.len())
});
let data: DataRef<Vec<usize>> = DataRef::ManuallyDropOwned(vec_manual_drop);

// step 2: construct device
let device: DeviceOpenBLAS = DeviceOpenBLAS::new(16);

// step 3: construct storage pinned to the device
let storage: Storage<DataRef<Vec<usize>>, usize, DeviceOpenBLAS> = Storage::new(data, device);

// step 4: construct layout (row-major case, where the last stride is 1)
// this will give 2-D layout with dynamic shape
// arguments: ([nrow, ncol], [stride_row, stride_col], offset)
let layout = Layout::new(vec![2, 3], vec![3, 1], 0).unwrap();
// if you prefer a static shape, you can use:
// let layout = Layout::new([2, 3], [3, 1], 0).unwrap();

// step 5: construct tensor
let tensor = TensorView::new(storage, layout);

println!("{tensor}");
// output:
// [[ 1 2 3]
//  [ 4 5 6]]

Similarly, &mut [T] can be used to create a mutable view TensorMut<'_, T, B, D> through a similar process.

2.3 To `&Vec<T>`: raw Function

We can return a reference to the underlying data:

let vec = vec![1, 2, 3, 4, 5, 6];
let tensor = rt::asarray((vec, [2, 3]));
println!("{tensor}");
// output (row-major):
// [[ 1 2 3]
//  [ 4 5 6]]

let slc: &Vec<usize> = tensor.raw();
println!("{slc:?}");
// output: [1, 2, 3, 4, 5, 6]

This is essentially the same as top-down deconstruction to obtain Vec<T>, except that we only need a reference without deconstructing the tensor, so a simpler function raw can achieve this.

Similarly, for owning tensors Tensor<T, B, D> or mutable views TensorMut<'_, T, B, D>, the raw_mut function can be used to obtain a mutable reference &mut Vec<T>.

This function also comes with a caveat.

RSTSR Does Not Verify Layout for raw-Generated &Vec<T>

This is the same as top-down deconstruction to obtain Vec<T>. RSTSR only returns a reference to the data; whether it complies with layout rules (e.g., c/f-contiguous) or whether the first element of the reference corresponds to the tensor's first element must be ensured by the user.

From this perspective, using the raw function is risky. However, since it does not involve memory safety, the function is not marked as unsafe. Users should still exercise caution when using raw.

A common mistake (even made by library developers) is failing to properly add offsets to pointers. We use the following Cholesky decomposition example to illustrate this. Suppose we have the following $3 \times 3$ f-contiguous matrix:

// prepare tensor
let vec: Vec<f64> = vec![1.0, 0.5, 2.0, 0.5, 5.0, 1.5, 2.0, 1.5, 8.0];
let device = DeviceFaer::default();
let tensor = rt::asarray((vec.clone(), [3, 3].f(), &device));
println!("{tensor:8.4}");
// output:
// [[   1.0000   0.5000   2.0000]
//  [   0.5000   5.0000   1.5000]
//  [   2.0000   1.5000   8.0000]]

If we want to perform a lower-triangular Cholesky decomposition on the bottom-right $2 \times 2$ submatrix, the standard approach in RSTSR is:

// standard way to perform Cholesky by RSTSR
let sub_mat = tensor.i((1..3, 1..3));
// [[   5.0000   1.5000]
//  [   1.5000   8.0000]]
let sub_chol = rt::linalg::cholesky((&sub_mat, Lower));
println!("{sub_chol:8.4}");
// [[   2.2361   0.0000]
//  [   0.6708   2.7477]]

Suppose we need to pass this to the lapack crate for Cholesky decomposition using other Rust types. In RSTSR, this can be done using the raw_mut function. However, without adding the correct offset to the slice generated by raw_mut, the following call is incorrect!

// wrong way to perform Cholesky by LAPACK
let mut tensor = rt::asarray((vec.clone(), [3, 3].f(), &device));
let mut sub_mat = tensor.i_mut((1..3, 1..3));
let mut info = 0;
unsafe { lapack::dpotrf(b'L', 2, sub_mat.raw_mut(), 3, &mut info) };
println!("{sub_mat:8.4}");
// This is not what we want! `sub_mat.raw_mut()` starts at 1.0 instead of 5.0!
// It actually factorizes tensor[0:2, 0:2] instead of tensor[1:3, 1:3]!
// [[   2.1794   1.5000]
//  [   1.5000   8.0000]]

The correct approach requires adding an offset to the pointer returned by raw_mut to ensure the pointer passed to FFI points to the first element of sub_mat:

// correct way to perform Cholesky by LAPACK
let mut tensor = rt::asarray((vec.clone(), [3, 3].f(), &device));
let mut sub_mat = tensor.i_mut((1..3, 1..3));
let offset = sub_mat.offset(); // offset is 4
let mut info = 0;
// notice that we need to add offset to the pointer
unsafe { lapack::dpotrf(b'L', 2, &mut sub_mat.raw_mut()[offset..], 3, &mut info) };
println!("{sub_mat:8.4}");
// [[   2.2361   1.5000]    the upper triangle is not referenced with b'L'
//  [   0.6708   2.7477]]
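
The effect of the missing offset can be reproduced without any external crate. The sketch below uses a small hand-written routine, `cholesky_lower`, as a stand-in for `dpotrf`; it is an assumption made for this illustration (not the LAPACK implementation) and follows the same convention that the first element of the slice is element (0, 0) and `lda` is the column stride.

```rust
/// Unblocked lower Cholesky on a column-major buffer, in place.
/// `a[0]` is taken as element (0, 0) and `lda` is the column stride.
/// Illustrative stand-in for `dpotrf` with b'L'; returns Err(k) if the
/// leading minor of order k is not positive definite.
fn cholesky_lower(n: usize, a: &mut [f64], lda: usize) -> Result<(), usize> {
    for j in 0..n {
        // diagonal entry: sqrt(a_jj - sum_k l_jk^2)
        let mut d = a[j + j * lda];
        for k in 0..j {
            d -= a[j + k * lda] * a[j + k * lda];
        }
        if d <= 0.0 {
            return Err(j + 1);
        }
        let d = d.sqrt();
        a[j + j * lda] = d;
        // entries below the diagonal in column j
        for i in (j + 1)..n {
            let mut s = a[i + j * lda];
            for k in 0..j {
                s -= a[i + k * lda] * a[j + k * lda];
            }
            a[i + j * lda] = s / d;
        }
    }
    Ok(())
}

fn main() {
    // the 3x3 f-contiguous matrix from the text
    let base = [1.0, 0.5, 2.0, 0.5, 5.0, 1.5, 2.0, 1.5, 8.0];

    // wrong: the slice starts at element (0, 0) of the full matrix
    let mut wrong = base;
    cholesky_lower(2, &mut wrong, 3).unwrap();
    println!("{:.4}", wrong[4]); // entry (1, 1) becomes 2.1794

    // correct: skip `offset = 1 + 1 * 3 = 4` elements first
    let mut right = base;
    cholesky_lower(2, &mut right[4..], 3).unwrap();
    println!("{:.4} {:.4} {:.4}", right[4], right[5], right[8]);
    // 2.2361 0.6708 2.7477
}
```

In both calls the routine is handed the same `n` and `lda`; only the starting position of the slice decides which submatrix is factorized.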

3. Conversions with Faer Types

Currently, RSTSR also supports conversions with a few other Rust types.

For Faer's MatRef and MatMut, RSTSR supports bidirectional conversions. Taking MatRef as an example, since it is a reference type, the process does not involve memory copying:

let vec: Vec<i32> = vec![1, 2, 3, 4, 5, 6];
let device = DeviceFaer::default();
let tensor = rt::asarray((vec, [2, 3], &device));
println!("{tensor}");
// [[ 1 2 3]
//  [ 4 5 6]]

// convert to a faer matrix view (MatRef)
use faer_ext::IntoFaer;
let faer_tensor = tensor.view().into_dim::<Ix2>().into_faer();
println!("{faer_tensor:?}");
// [
// [1, 2, 3],
// [4, 5, 6],
// ]

// convert back to rstsr tensor
use rstsr_core::tensor::ext_conversion::IntoRSTSR;
let rstsr_tensor = faer_tensor.into_rstsr();
println!("{rstsr_tensor}");
// [[ 1 2 3]
//  [ 4 5 6]]

¹ RSTSR's storage of reference types differs from most current matrix or tensor libraries. Advanced users may find the following discussion interesting.

About the Underlying Storage of Tensor View Types

RSTSR uses a simple approach to store owning and reference types:

pub struct DataOwned<C> {
    pub(crate) raw: C,
}

pub enum DataRef<'a, C> {
    TrueRef(&'a C),
    ManuallyDropOwned(ManuallyDrop<C>),
}

For CPU backends, the generic parameter C typically refers to Vec<T>. This approach makes lifetime definitions clear, as everything can be described using Vec<T>, which is very convenient for library development.

However, by definition, reference types should be &[T] rather than &Vec<T> (also refer to clippy ptr_arg). A pointer *const T, length, and lifetime can together represent a memory reference &[T]. Most matrix and tensor libraries, including ndarray, Faer, and nalgebra, define reference types this way.

It's hard to say which approach is better. However, since RSTSR's backends may not be CPU-based (e.g., data could be stored on disk or GPU), and reference types for disk or GPU may not be describable using &[T] or pointers, RSTSR currently uses &Vec<T> to represent reference types. The side effect is that when users only have &[T] without the corresponding Vec<T>, they must first convert &[T] into a Vec<T> using ManuallyDrop (to avoid double free), constructing a Vec<T> that won't be dropped, and then reference it.

The rt::asarray function encapsulates this process of converting &[T] into Vec<T> using ManuallyDrop. For common data types (e.g., f64, Complex<f64>), this typically has no impact. However, for types with destructors, users may need to be more careful when using RSTSR.
