Type Conversion between Tensor and Other Rust Types
Tensor and Rust Type Conversions
Part of this documentation has been covered in the previous two sections (Tensor Creation and Tensor Deconstruction and Ownership). In particular, Tensor Structure and Ownership focuses on explaining RSTSR's features through code examples.
When using tensor libraries, users often need to interact with other Rust types (including Vec<T>, &[T], or other linear algebra libraries like Faer). This section systematically explains how to convert between RSTSR tensors and other Rust types from a practical usage perspective.
This documentation applies only to CPU backends, which are currently the only backend types implemented. It may not apply to other backend types added in future versions of RSTSR.
1. Conversions with Vec<T>
1.1 From Vec<T>: asarray
RSTSR's tensor Tensor<T, B, D> can be created from raw vector data, dimensions, and device information using the rt::asarray function. The rt::asarray function has multiple overloads, which will be detailed in future API documentation.
The following program stores raw data as a (2, 3)-dimensional tensor on a 16-core parallel OpenBLAS device. Note that this program behaves differently under row-major and column-major layouts:
let device = DeviceOpenBLAS::new(16);
let vec = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let tensor = rt::asarray((vec, [2, 3], &device));
println!("{tensor:8.4}");
// output (row-major):
// [[ 1.0000 2.0000 3.0000]
// [ 4.0000 5.0000 6.0000]]
// output (col-major):
// [[ 1.0000 3.0000 5.0000]
// [ 2.0000 4.0000 6.0000]]
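The difference between the two outputs comes only from how the same flat buffer is indexed. The following std-only sketch (it does not use RSTSR; the helper names contiguous_strides and as_rows are hypothetical and introduced here for illustration) reads one buffer under both conventions:

```rust
/// Strides of a contiguous 2-D layout for the given shape.
/// `row_major = true`: the last stride is 1 (C order).
/// `row_major = false`: the first stride is 1 (Fortran order).
fn contiguous_strides(shape: [usize; 2], row_major: bool) -> [usize; 2] {
    let [nrow, ncol] = shape;
    if row_major { [ncol, 1] } else { [1, nrow] }
}

/// Read a flat buffer as a 2-D matrix and return its rows.
fn as_rows(buf: &[f64], shape: [usize; 2], row_major: bool) -> Vec<Vec<f64>> {
    let [s0, s1] = contiguous_strides(shape, row_major);
    (0..shape[0])
        .map(|i| (0..shape[1]).map(|j| buf[i * s0 + j * s1]).collect::<Vec<f64>>())
        .collect()
}

fn main() {
    let buf = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    // row-major: [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    println!("{:?}", as_rows(&buf, [2, 3], true));
    // col-major: [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]]
    println!("{:?}", as_rows(&buf, [2, 3], false));
}
```

The buffer is never rearranged; only the strides used to locate element (i, j) change.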
1.2 From Vec<T>: Manual Construction
As mentioned in the previous section, RSTSR tensors have a multi-layered structure. While the rt::asarray function is intuitive, it hides the specific process of constructing RSTSR tensors.
The following program demonstrates, step by step, how a complete tensor is constructed from the basic Vec<T> data storage unit:
use rstsr_core::prelude_dev::*;
// step 1: wrap vector into data representation
let vec = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let data: DataOwned<Vec<f64>> = DataOwned::from(vec);
// step 2: construct device
let device: DeviceOpenBLAS = DeviceOpenBLAS::new(16);
// step 3: construct storage that is pinned to the device
let storage: Storage<DataOwned<Vec<f64>>, f64, DeviceOpenBLAS> = Storage::new(data, device);
// step 4: construct layout (row-major case, where the last stride is 1)
// this gives a 2-D layout with dynamic shape
// arguments: ([nrow, ncol], [stride_row, stride_col], offset)
let layout = Layout::new(vec![2, 3], vec![3, 1], 0).unwrap();
// if you insist on using a static shape, you can use:
// let layout = Layout::new([2, 3], [3, 1], 0).unwrap();
// step 5: construct tensor
let tensor = Tensor::new(storage, layout);
println!("{tensor:8.4}");
// output:
// [[ 1.0000 2.0000 3.0000]
// [ 4.0000 5.0000 6.0000]]
1.3 Into Vec<T>: into_vec Function
For CPU backends, this function converts a 1-D tensor into a Vec<T>.
Please note the following side effects:
- This function prohibits converting higher-dimensional tensors (e.g., 2-D) into vectors. If you genuinely need to convert higher-dimensional tensors into vectors, you must first use into_shape or into_contig to convert them into 1-D tensors.
- For Tensor<T, B, D> (tensors that own their data), this function typically does not copy data, meaning it has almost no runtime cost:
// create tensor
let vec_raw = vec![1, 2, 3, 4, 5, 6];
let ptr_raw = vec_raw.as_ptr();
let tensor = rt::asarray(vec_raw);
// convert tensor to vector
let vec_out = tensor.into_vec();
let ptr_out = vec_out.as_ptr();
println!("{vec_out:?}");
// data is moved and no copy occurs
assert_eq!(ptr_raw, ptr_out);
However, if the offset is non-zero, the stride is not one, or the underlying data length does not match the dimension information, data will still be copied:
// create tensor with stride -1
// by flipping the tensor along the 0-th axis
let vec_raw = vec![1, 2, 3, 4, 5, 6];
let ptr_raw = vec_raw.as_ptr();
let tensor = rt::asarray(vec_raw).into_flip(0);
println!("{tensor:?}");
// output: [6, 5, 4, 3, 2, 1]
// convert tensor to vector
let vec_out = tensor.into_vec();
let ptr_out = vec_out.as_ptr();
println!("{vec_out:?}");
// output: [6, 5, 4, 3, 2, 1]
// data is cloned, so this `into_vec` is expensive
assert_ne!(ptr_raw, ptr_out);
- For reference types (e.g., TensorView<'_, T, B, D>), this function will copy data.
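The pointer comparisons above rely on ordinary Vec semantics. As a std-only sketch (no RSTSR involved; pass_through and gather_reversed are hypothetical helpers that only mimic the two cases), moving a Vec keeps its heap buffer, while gathering elements in a different order needs a new allocation:

```rust
/// Moving a vector transfers ownership of the same heap buffer.
fn pass_through(v: Vec<i32>) -> Vec<i32> {
    v
}

/// Gathering elements in reverse order (stride -1) fills a new buffer.
fn gather_reversed(v: &[i32]) -> Vec<i32> {
    v.iter().rev().copied().collect()
}

fn main() {
    let vec_raw = vec![1, 2, 3, 4, 5, 6];
    let ptr_raw = vec_raw.as_ptr();

    // copy case: a second allocation exists while `vec_raw` is still alive
    let reversed = gather_reversed(&vec_raw);
    assert_ne!(ptr_raw, reversed.as_ptr());

    // move case: same allocation, no element is touched
    let moved = pass_through(vec_raw);
    assert_eq!(ptr_raw, moved.as_ptr());

    println!("{moved:?} {reversed:?}");
}
```

This is why the cost of into_vec depends on the layout: only a layout that already matches the buffer order can be handed over without a copy.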
1.4 Into Vec<T>: Top-down Deconstruction
This discussion only applies to Tensor<T, B, D> (tensors that own their data).
RSTSR tensors can be constructed from scratch using Vec<T> or deconstructed top-down from Tensor<T, B, D>. Deconstructing a tensor requires calling into_raw_parts twice and into_raw once:
// construct tensor by asarray
let device = DeviceOpenBLAS::new(16);
let vec = vec![1, 2, 3, 4, 5, 6];
let tensor = rt::asarray((vec, [2, 3], &device));
// step 1: tensor -> (storage, layout)
let (storage, layout) = tensor.into_raw_parts();
println!("{layout:?}");
// step 2: storage -> (data, device)
let (data, device) = storage.into_raw_parts();
println!("{device:?}");
// step 3: data -> raw, where DeviceOpenBLAS::Raw = Vec<T>
let vec = data.into_raw();
println!("{vec:?}");
// output: [1, 2, 3, 4, 5, 6]
However, this approach also has side effects. It only returns the underlying vector used for data storage and ignores the tensor's layout. For tensors of any dimension (including higher-dimensional ones like 2-D), into_raw_parts and into_raw can still extract the Vec<T> data, but this data may not match what the into_vec function returns. This can be demonstrated with a tensor whose stride is not 1:
let vec_raw = vec![1, 2, 3, 4, 5, 6];
let ptr_raw = vec_raw.as_ptr();
let tensor = rt::asarray(vec_raw).into_flip(0);
// step 1: tensor -> (storage, layout)
let (storage, layout) = tensor.into_raw_parts();
println!("{layout:?}");
// output:
// 1-Dim (dyn), contiguous: Custom
// shape: [6], stride: [-1], offset: 5
// step 2: storage -> (data, device)
let (data, device) = storage.into_raw_parts();
println!("{device:?}");
// step 3: data -> raw, where DeviceOpenBLAS::Raw = Vec<T>
// in this way, original `vec` will be returned
let vec_out = data.into_raw();
let ptr_out = vec_out.as_ptr();
assert_eq!(ptr_raw, ptr_out);
println!("{vec_out:?}");
// output: [1, 2, 3, 4, 5, 6]
// please note that, `tensor.into_vec` will give
// output: [6, 5, 4, 3, 2, 1]
Therefore, if you want to obtain Vec<T> through top-down deconstruction, you must ensure that the tensor or vector's layout meets your expectations.
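To check what a raw buffer means under a given layout, the logical order can be recomputed from the shape, stride, and offset. The sketch below is std-only and assumes a 1-D layout like the one printed above (shape [6], stride [-1], offset 5); logical_order is a hypothetical helper, not an RSTSR function:

```rust
/// Walk a raw buffer in the order described by a 1-D layout:
/// element `i` lives at position `offset + i * stride`.
fn logical_order(raw: &[i32], len: usize, stride: isize, offset: usize) -> Vec<i32> {
    (0..len)
        .map(|i| {
            let pos = offset as isize + i as isize * stride;
            raw[pos as usize]
        })
        .collect()
}

fn main() {
    let raw = vec![1, 2, 3, 4, 5, 6];
    let logical = logical_order(&raw, 6, -1, 5);
    println!("raw buffer:    {raw:?}"); // [1, 2, 3, 4, 5, 6]
    println!("logical order: {logical:?}"); // [6, 5, 4, 3, 2, 1]
}
```

The first line corresponds to what top-down deconstruction returns, the second to what into_vec returns.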
1.5 To Vec<T>: to_vec Function
This function is essentially the same as into_vec, including its usage and side effects. It does not consume the input tensor, but it always copies memory.
2. Conversions with &[T]/&mut [T] or Pointer Types
In Rust, &[T] (or &mut [T]) is very similar to a pointer type: compared with a raw pointer, &[T] additionally carries a length guarantee. Therefore, when you have a *const T pointer and a usize length in Rust, the approach is the same as working with &[T].
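As a minimal std-only sketch of that equivalence, a pointer and a length can be combined into a slice with std::slice::from_raw_parts; the wrapper slice_from_ptr is a hypothetical helper written for this illustration, and the caller is responsible for the safety conditions listed in its comment:

```rust
use std::slice;

/// Rebuild a slice from a raw pointer and a length.
///
/// # Safety
/// `ptr` must point to `len` consecutive, initialized `f64` values that stay
/// alive and are not mutated for the whole lifetime `'a`.
unsafe fn slice_from_ptr<'a>(ptr: *const f64, len: usize) -> &'a [f64] {
    unsafe { slice::from_raw_parts(ptr, len) }
}

fn main() {
    let buf = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    let (ptr, len) = (buf.as_ptr(), buf.len());

    // `buf` outlives `view`, so the safety conditions hold here
    let view = unsafe { slice_from_ptr(ptr, len) };
    assert_eq!(view, buf.as_slice());
    println!("{view:?}");
}
```

Once the slice exists, everything in this section that applies to &[T] applies to the pointer and length pair as well.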
2.1 From &[T]: asarray
As with Vec<T>, a tensor can be created from &[T] using the rt::asarray function. However, unlike the Vec<T> case, it returns a tensor view TensorView<'_, T, B, D> rather than an owning tensor Tensor<T, B, D>. Note that this program behaves differently under row-major and column-major layouts:
let device = DeviceOpenBLAS::new(16);
let vec = vec![1, 2, 3, 4, 5, 6];
let tensor = rt::asarray((&vec, [2, 3], &device));
println!("{tensor}");
// output (row-major):
// [[ 1 2 3]
// [ 4 5 6]]
// output (col-major):
// [[ 1 3 5]
// [ 2 4 6]]
Similarly, &mut [T] can be used to create a mutable view TensorMut<'_, T, B, D> through a similar process.
2.2 From &[T]: Manual Construction
The approach here is consistent with manually constructing from Vec<T>. However, note that RSTSR's CPU backend always represents reference types as &Vec<T> rather than &[T] (see footnote 1). Therefore, RSTSR first requires constructing a Vec<T> from the &[T]; this vector is never dropped automatically and carries a lifetime annotation. Specifically:
use rstsr_core::prelude_dev::*;
use std::mem::ManuallyDrop;
let vec = vec![1, 2, 3, 4, 5, 6];
let vec_ref: &[usize] = &vec;
// step 1: wrap reference into data representation
// this uses `ManuallyDrop` to avoid a double free
let vec_manual_drop: ManuallyDrop<Vec<usize>> = ManuallyDrop::new(unsafe {
    Vec::from_raw_parts(vec_ref.as_ptr() as *mut _, vec_ref.len(), vec_ref.len())
});
let data: DataRef<Vec<usize>> = DataRef::ManuallyDropOwned(vec_manual_drop);
// step 2: construct device
let device: DeviceOpenBLAS = DeviceOpenBLAS::new(16);
// step 3: construct storage that is pinned to the device
let storage: Storage<DataRef<Vec<usize>>, usize, DeviceOpenBLAS> = Storage::new(data, device);
// step 4: construct layout (row-major case, where the last stride is 1)
// this gives a 2-D layout with dynamic shape
// arguments: ([nrow, ncol], [stride_row, stride_col], offset)
let layout = Layout::new(vec![2, 3], vec![3, 1], 0).unwrap();
// if you insist on using a static shape, you can use:
// let layout = Layout::new([2, 3], [3, 1], 0).unwrap();
// step 5: construct tensor
let tensor = TensorView::new(storage, layout);
println!("{tensor}");
// output:
// [[ 1 2 3]
// [ 4 5 6]]
Similarly, &mut [T] can be used to create a mutable view TensorMut<'_, T, B, D> through a similar process.
2.3 To &Vec<T>: raw Function
We can return a reference to the underlying data:
let vec = vec![1, 2, 3, 4, 5, 6];
let tensor = rt::asarray((vec, [2, 3]));
println!("{tensor}");
// output (row-major):
// [[ 1 2 3]
// [ 4 5 6]]
let slc: &Vec<usize> = tensor.raw();
println!("{slc:?}");
// output: [1, 2, 3, 4, 5, 6]
This is essentially the same as top-down deconstruction to obtain Vec<T>, except that we only need a reference without deconstructing the tensor, so a simpler function raw can achieve this.
Similarly, for owning tensors Tensor<T, B, D> or mutable views TensorMut<'_, T, B, D>, the raw_mut function can be used to obtain a mutable reference &mut Vec<T>.
This function also has side effects.
&Vec<T> Generated by raw
This is the same as top-down deconstruction to obtain Vec<T>. RSTSR only returns a reference to the data; whether it complies with layout rules (e.g., c/f-contiguous) or whether the first element of the reference corresponds to the tensor's first element must be ensured by the user.
From this perspective, using the raw function is risky. However, since it does not involve memory safety, the function is not marked as unsafe. Users should still exercise caution when using raw.
A common mistake (even made by library developers) is failing to properly add offsets to pointers. We use the following Cholesky decomposition example to illustrate this. Suppose we have the following f-contiguous matrix:
// prepare tensor
let vec: Vec<f64> = vec![1.0, 0.5, 2.0, 0.5, 5.0, 1.5, 2.0, 1.5, 8.0];
let device = DeviceFaer::default();
let tensor = rt::asarray((vec.clone(), [3, 3].f(), &device));
println!("{tensor:8.4}");
// output:
// [[ 1.0000 0.5000 2.0000]
// [ 0.5000 5.0000 1.5000]
// [ 2.0000 1.5000 8.0000]]
If we want to perform a lower-triangular Cholesky decomposition on the bottom-right submatrix, the standard approach in RSTSR is:
// standard way to perform Cholesky by RSTSR
let sub_mat = tensor.i((1..3, 1..3));
// [[ 5.0000 1.5000]
// [ 1.5000 8.0000]]
let sub_chol = rt::linalg::cholesky((&sub_mat, Lower));
println!("{sub_chol:8.4}");
// [[ 2.2361 0.0000]
// [ 0.6708 2.7477]]
Suppose we need to pass this to the lapack crate for Cholesky decomposition using other Rust types. In RSTSR, this can be done using the raw_mut function. However, passing the slice generated by raw_mut as-is, without adding the correct offset, is incorrect: that slice starts at the first element of the underlying buffer, not at the first element of sub_mat.
The correct approach adds an offset to the slice returned by raw_mut so that the pointer passed to FFI points to the first element of sub_mat:
// correct way to perform Cholesky by LAPACK
let mut tensor = rt::asarray((vec.clone(), [3, 3].f(), &device));
let mut sub_mat = tensor.i_mut((1..3, 1..3));
let offset = sub_mat.offset(); // offset is 4
let mut info = 0;
// notice that we need to add offset to the pointer
unsafe { lapack::dpotrf(b'L', 2, &mut sub_mat.raw_mut()[offset..], 3, &mut info) };
println!("{sub_mat:8.4}");
// [[ 2.2361 1.5000] upper-triangular does not matter
// [ 0.6708 2.7477]]
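The effect of the offset can be reproduced without LAPACK. The following std-only sketch uses toy_potrf_lower, a hypothetical unblocked routine written for this illustration that only imitates the (n, a, lda) calling convention of dpotrf for the lower-triangular case; it is not the LAPACK implementation. It factorizes the same column-major buffer with and without the offset:

```rust
/// In-place lower Cholesky of the leading `n x n` block of a column-major
/// buffer `a` with leading dimension `lda`. Only the lower triangle is
/// written. Returns `false` if the block is not positive definite.
fn toy_potrf_lower(n: usize, a: &mut [f64], lda: usize) -> bool {
    for j in 0..n {
        // diagonal entry: l_jj = sqrt(a_jj - sum_k l_jk^2)
        let mut d = a[j + j * lda];
        for k in 0..j {
            d -= a[j + k * lda] * a[j + k * lda];
        }
        if d <= 0.0 {
            return false;
        }
        let d = d.sqrt();
        a[j + j * lda] = d;
        // entries below the diagonal: l_ij = (a_ij - sum_k l_ik * l_jk) / l_jj
        for i in (j + 1)..n {
            let mut s = a[i + j * lda];
            for k in 0..j {
                s -= a[i + k * lda] * a[j + k * lda];
            }
            a[i + j * lda] = s / d;
        }
    }
    true
}

fn main() {
    // same f-contiguous 3x3 matrix as above
    let buf = vec![1.0, 0.5, 2.0, 0.5, 5.0, 1.5, 2.0, 1.5, 8.0];
    let lda = 3;
    let (row, col) = (1, 1);
    let offset = row + col * lda; // position of sub_mat[0, 0] in the buffer

    // without the offset, the top-left 2x2 block is factorized instead
    let mut wrong = buf.clone();
    toy_potrf_lower(2, &mut wrong, lda);
    println!("no offset:   {:.4} {:.4} {:.4}", wrong[0], wrong[1], wrong[4]);

    // with the offset, the bottom-right 2x2 block is factorized
    let mut right = buf.clone();
    toy_potrf_lower(2, &mut right[offset..], lda);
    println!("with offset: {:.4} {:.4} {:.4}", right[4], right[5], right[8]);
}
```

With the offset, the three lower-triangular entries agree with the RSTSR result shown earlier; without it, the routine silently overwrites a different block of the matrix.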
3. Conversions with Faer Types
Currently, RSTSR also supports conversions with a few other Rust types.
For Faer's MatRef and MatMut, RSTSR supports bidirectional conversions. Taking MatRef as an example, since it is a reference type, the process does not involve memory copying:
let vec: Vec<i32> = vec![1, 2, 3, 4, 5, 6];
let device = DeviceFaer::default();
let tensor = rt::asarray((vec, [2, 3], &device));
println!("{tensor}");
// [[ 1 2 3]
// [ 4 5 6]]
// convert to faer tensor
use faer_ext::IntoFaer;
let faer_tensor = tensor.view().into_dim::<Ix2>().into_faer();
println!("{faer_tensor:?}");
// [
// [1, 2, 3],
// [4, 5, 6],
// ]
// convert back to rstsr tensor
use rstsr_core::tensor::ext_conversion::IntoRSTSR;
let rstsr_tensor = faer_tensor.into_rstsr();
println!("{rstsr_tensor}");
// [[ 1 2 3]
// [ 4 5 6]]
Footnotes
1. RSTSR's storage of reference types differs from most current matrix or tensor libraries. Advanced users may find the following discussion interesting.
About the Underlying Storage of Tensor View Types
RSTSR uses a simple approach to store owning and reference types:
pub struct DataOwned<C> {
    pub(crate) raw: C,
}
pub enum DataRef<'a, C> {
    TrueRef(&'a C),
    ManuallyDropOwned(ManuallyDrop<C>),
}
For CPU backends, the generic parameter C typically refers to Vec<T>. This approach makes lifetime definitions clear, as everything can be described using Vec<T>, which is very convenient for library development.
However, by definition, reference types should be &[T] rather than &Vec<T> (also refer to clippy ptr_arg). A pointer *const T, length, and lifetime can together represent a memory reference &[T]. Most matrix and tensor libraries, including ndarray, Faer, and nalgebra, define reference types this way.
It's hard to say which approach is better. However, since RSTSR's backends may not be CPU-based (e.g., data could be stored on disk or GPU), and reference types for disk or GPU may not be describable using &[T] or pointers, RSTSR currently uses &Vec<T> to represent reference types. The side effect is that when users only have &[T] without the corresponding Vec<T>, they must first convert &[T] into a Vec<T> using ManuallyDrop (to avoid double free), constructing a Vec<T> that won't be dropped, and then reference it.
The rt::asarray function encapsulates this process of converting &[T] into Vec<T> using ManuallyDrop. For common data types (e.g., f64, Complex<f64>), this typically has no impact. However, for types with destructors, users may need to be more careful when using RSTSR.