Autodocs
-
template<class Op = danceq::internal::Operator<danceq::internal::BasisU1<danceq::internal::ContainerTable<danceq::internal::State<128, 2>>>, double, danceq::Hamiltonian>>
class ShellMatrix
Given an Operator, the ShellMatrix class provides a matrix-free matrix-vector multiplication.
Template parameters
class Op: Operator class.
Default: danceq::Operator<danceq::BasisU1<danceq::ContainerTable<danceq::State<128,2,uint64_t>>>, double>
Things to know
The class supports distributed-memory (MPI) and shared-memory (OpenMP) parallelization.
The syntax for each parallelization option is equivalent.
The ShellMatrix should be returned from the Operator class via create_ShellMatrix(…).
All member functions are const, so the ShellMatrix cannot be modified after creation.
The code works with the Vector class, which distributes memory across all processes if MPI is used. Use the function create_Vector() to ensure the correct memory layout.
The following applies only if MPI is utilized and the Vector is distributed across multiple processes. The full communication pattern is analyzed when the ShellMatrix is created. Therefore, additional memory used during the multiplication process to communicate between different processes is only allocated once. The memory overhead due to the communication can be limited during the construction.
The debugging level can be adjusted from 0 (no debugging) to 10 (maximal debugging):
#define dbug_level 10 // 10 is maximal debugging, 0 is no debugging
Warning
The system size L has to be smaller than or equal to the maximal number of sites MaxSites of the underlying State class.
Public Types
-
using state_class = typename Op::state_class
State class.
State class can be retrieved from the ShellMatrix class:
// This is equivalent to: State state;
ShellMatrix::state_class state;
-
using scalartype = typename Op::scalartype
Scalar.
The underlying scalar defined from the Operator class can be retrieved from the ShellMatrix class:
ShellMatrix::scalartype A;
-
using scalartype_real = decltype(std::real(std::declval<scalartype>()))
Real part of the scalar.
The real part of the underlying scalar defined from the Operator class can be retrieved from the ShellMatrix class:
ShellMatrix::scalartype_real A;
This is equivalent to scalartype if it is real. If scalartype is complex, e.g., std::complex<double>, scalartype_real is double.
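The deduction behind scalartype_real can be checked in isolation. The following is a minimal sketch using only the standard library; the alias name real_part_t is hypothetical and is not part of the ShellMatrix interface, it merely repeats the decltype expression from the declaration above.

```cpp
#include <complex>
#include <type_traits>
#include <utility>

// Standalone copy of the deduction used for scalartype_real.
// The name real_part_t is hypothetical and only used in this sketch.
template <class Scalar>
using real_part_t = decltype(std::real(std::declval<Scalar>()));

// A real scalar maps to itself.
static_assert(std::is_same<real_part_t<double>, double>::value,
              "double stays double");

// A complex scalar maps to its underlying real type.
static_assert(std::is_same<real_part_t<std::complex<double>>, double>::value,
              "std::complex<double> maps to double");
static_assert(std::is_same<real_part_t<std::complex<float>>, float>::value,
              "std::complex<float> maps to float");
```

All checks are evaluated at compile time, so the snippet produces no runtime output.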
Public Functions
-
int32_t apply(const danceq::internal::Vector<scalartype> &input, danceq::internal::Vector<scalartype> &output, const bool reset_output_to_zero = true) const
High-level function to perform a matrix-free multiplication using the Vector class.
The function is a wrapper for the low-level function MATMUL_basic(…). Input and output are instances of the Vector class that should be created with create_Vector(). Before the multiplication is executed, the output data can be set to zero using the boolean reset_output_to_zero.
- Parameters:
input – Input Vector
output – Output Vector
reset_output_to_zero – Boolean to set output to zero before multiplication
- Returns:
error_code
-
Vector<scalartype> operator*(const Vector<scalartype> &v) const
High-level function to perform a matrix-free multiplication using the Vector class.
The function is a wrapper for the low-level function MATMUL_basic(…). Input is an instance of the Vector class that should be created with create_Vector().
The usage is straightforward:
w = H*v;
- Parameters:
v – Input Vector
- Returns:
Output vector
-
scalartype get_expectation_value(const danceq::internal::Vector<scalartype> &lhs, const danceq::internal::Vector<scalartype> &rhs) const
High-level function to compute the expectation value using the Vector class.
The function is a wrapper for the low-level function EXPECTATION_VALUE_basic(…). lhs (rhs) refers to the left(right)-hand side. Both are instances of the Vector class that should be created with create_Vector(). The complex conjugate of the left-hand side is used when the scalars are complex.
- Parameters:
lhs – Left-hand side
rhs – Right-hand side
- Returns:
Expectation value
-
scalartype get_expectation_value(const danceq::internal::Vector<scalartype> &psi) const
High-level function to compute the expectation value using the Vector class.
The function is a wrapper for get_expectation_value(psi,psi) which calls the low-level function EXPECTATION_VALUE_basic(…). The input is an instance of the Vector class that should be created with create_Vector(). The complex conjugate is used for the left-hand side.
- Parameters:
psi – Left- and right-hand side
- Returns:
Expectation value
-
Vector<scalartype> create_Vector(void) const
Creates a Vector class instance compatible with the ShellMatrix.
The function creates an instance of the Vector class that can be used in the computations. This is particularly important when MPI is enabled to ensure the correct memory layout. The Vector class has many additional features for simple linear algebra routines.
- Returns:
Vector
-
int32_t info(const bool print_output = true, const std::string pre = "") const
Prints info and checks data.
- Parameters:
print_output – Prints output if true
pre – String printed in front of the output
- Returns:
error_code
-
uint64_t get_dim(void) const
Returns dim.
- Returns:
dim
-
uint64_t get_mydim(void) const
Returns the local dimension.
This is only available with MPI.
- Returns:
mydim
-
uint64_t get_start(void) const
Returns the beginning of the rank’s part.
This is only available with MPI.
- Returns:
start
-
uint64_t get_end(void) const
Returns the end of the rank’s part.
This is only available with MPI.
- Returns:
end
Protected Functions
-
ShellMatrix(const Op *operator_ptr_, const int32_t myrank_, const int32_t world_size_, const uint64_t start_, const uint64_t end_, const uint64_t number_of_communication_steps_, const uint64_t maximal_number_of_elements_to_recv_, const uint64_t maximal_number_of_elements_to_send_, const std::vector<uint64_t> dim_per_step_, const std::vector<uint64_t> total_number_of_elements_to_recv_per_step_, const std::vector<uint64_t> total_number_of_elements_to_send_per_step_, const std::vector<uint64_t> ownership_per_rank_, const std::vector<std::vector<uint64_t>> number_of_elements_to_send_per_rank_per_step_, const std::vector<std::vector<uint64_t>> number_of_elements_to_recv_per_rank_per_step_)
Constructor used by the Operator class.
The full communication pattern was generated by the Operator class when calling create_ShellMatrix(…). The memory overhead can be limited by overhead_in_GB_per_core.
The members are listed here.
- Parameters:
operator_ptr_ – Sets the pointer to the Operator
myrank_ – Sets myrank
world_size_ – Sets world_size
start_ – Sets start
end_ – Sets end
number_of_communication_steps_ – Sets number_of_communication_steps
maximal_number_of_elements_to_recv_ – Sets maximal_number_of_elements_to_recv
maximal_number_of_elements_to_send_ – Sets maximal_number_of_elements_to_send
dim_per_step_ – Sets dim_per_step
total_number_of_elements_to_recv_per_step_ – Sets total_number_of_elements_to_recv_per_step
total_number_of_elements_to_send_per_step_ – Sets total_number_of_elements_to_send_per_step
ownership_per_rank_ – Sets ownership_per_rank
number_of_elements_to_send_per_rank_per_step_ – Sets number_of_elements_to_send_per_rank_per_step
number_of_elements_to_recv_per_rank_per_step_ – Sets number_of_elements_to_recv_per_rank_per_step
Private Functions
-
int32_t MATMUL_basic(const scalartype *input, scalartype *output, const bool reset_output_to_zero = true) const
Low-level function to perform the matrix-free matrix-vector multiplication.
This function is the core of the ShellMatrix class and performs the matrix-free multiplication. It takes input and output pointers to correctly allocated data. For example, they can point to the data of the Vector class or to the data of PETSc's internal vector structure Vec. Note that if the data is distributed over different processes with MPI, each pointer refers only to the local data. The input is marked const and is therefore not modified. Before the multiplication is executed, the output data can be set to zero using the boolean reset_output_to_zero. Not setting the output to zero allows an efficient construction of Krylov space methods in which only two (instead of three) vectors are stored. The function has four different implementations (two for MPI, one for OpenMP, and one without parallelization) as discussed here.
- Parameters:
input – Pointer to the local input data
output – Pointer to the local output data
reset_output_to_zero – Boolean to set output to zero before multiplication
- Returns:
error_code
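The effect of reset_output_to_zero can be illustrated with a serial toy example. This is a sketch under stated assumptions, not the library's implementation: matmul_sketch and krylov_update_sketch are hypothetical names, the operator is a hypothetical stand-in with (O x)_i = x_{i-1} + x_{i+1}, and the way the Krylov update reuses storage is one plausible reading of the remark about storing only two vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a matrix-free operator with open boundaries:
// (O x)_i = x_{i-1} + x_{i+1}. No matrix is stored at any point.
int matmul_sketch(const double* input, double* output, std::size_t dim,
                  bool reset_output_to_zero = true) {
    assert(input != output);  // in-place multiplication is not supported here
    if (reset_output_to_zero) {
        for (std::size_t i = 0; i < dim; ++i) output[i] = 0.0;
    }
    // Accumulate O * input on top of whatever output currently holds.
    for (std::size_t i = 0; i < dim; ++i) {
        if (i > 0) output[i] += input[i - 1];
        if (i + 1 < dim) output[i] += input[i + 1];
    }
    return 0;  // error code, 0 meaning success in this sketch
}

// Assumed Krylov-style update: v_prev is overwritten with O*v - beta*v_prev,
// so no third vector is needed for the product.
void krylov_update_sketch(const std::vector<double>& v,
                          std::vector<double>& v_prev, double beta) {
    assert(v.size() == v_prev.size());
    for (double& x : v_prev) x *= -beta;  // pre-fill the output with -beta*v_prev
    matmul_sketch(v.data(), v_prev.data(), v.size(), false);  // add O*v
}
```

With the reset disabled, the product is added to the existing output, which is what lets the update reuse the storage of the previous vector.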
-
scalartype EXPECTATION_VALUE_basic(const scalartype *lhs, const scalartype *rhs) const
Low-level function to compute the expectation value using a matrix-free multiplication.
It takes two const inputs: lhs (left-hand side) and rhs (right-hand side). Both are pointers to correctly allocated data, e.g., from the Vector class. If complex scalars are used, the complex conjugate of lhs is used:
\[o = \sum_{ij} \texttt{conj}(\mathrm{lhs}_i)\, O_{ij}\, \mathrm{rhs}_j\]
\(O\) refers to the Operator. The function has four different implementations (two for MPI, one for OpenMP, and one without parallelization) as discussed here.
- Parameters:
lhs – Left-hand side
rhs – Right-hand side
- Returns:
Expectation value
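The sum over conj(lhs_i) O_ij rhs_j can be evaluated without storing O by generating one row of the product at a time. The following serial sketch illustrates this; the function name expectation_value_sketch and the operator (O_ij = 1 if |i - j| = 1, otherwise 0) are hypothetical stand-ins and do not reflect the library's internals.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical matrix-free operator: O_ij = 1 if |i - j| == 1, else 0.
// Computes o = sum_ij conj(lhs_i) * O_ij * rhs_j row by row.
cplx expectation_value_sketch(const std::vector<cplx>& lhs,
                              const std::vector<cplx>& rhs) {
    assert(lhs.size() == rhs.size());
    const std::size_t dim = rhs.size();
    cplx o(0.0, 0.0);
    for (std::size_t i = 0; i < dim; ++i) {
        // (O rhs)_i, generated on the fly instead of read from a stored matrix.
        cplx row(0.0, 0.0);
        if (i > 0) row += rhs[i - 1];
        if (i + 1 < dim) row += rhs[i + 1];
        // The left-hand side enters complex-conjugated.
        o += std::conj(lhs[i]) * row;
    }
    return o;
}
```

Passing the same vector for both arguments corresponds to the single-argument overload get_expectation_value(psi); for a Hermitian operator such as this toy example the result is then real.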
Private Members
-
const Op *operator_ptr
Pointer to the operator that is represented.
-
const uint64_t dim
Dimension.
-
const MPI_Datatype MPI_SCALAR
Datatype used by MPI.
-
const int32_t world_size
Number of MPI processes.
-
const int32_t myrank
MPI rank of this process.
-
const uint64_t mydim
Local dimension of rank.
-
const uint64_t start
Start of local rows enumerated by the underlying basis in operator_ptr.
-
const uint64_t end
End of local rows enumerated by the underlying basis in operator_ptr.
-
const uint64_t number_of_communication_steps
Number of communication steps between all processes for matrix-vector multiplication.
-
const uint64_t maximal_number_of_elements_to_recv
Maximal number of elements to receive during the multiplication.
-
const uint64_t maximal_number_of_elements_to_send
Maximal number of elements to send during the multiplication.
-
const std::vector<uint64_t> dim_per_step
The number of rows that are processed during each communication step. It is identical for each MPI process.
-
const std::vector<uint64_t> total_number_of_elements_to_recv_per_step
Total number of elements that are received in each communication step. It is of size number_of_communication_steps.
-
const std::vector<uint64_t> total_number_of_elements_to_send_per_step
Total number of elements that are sent in each communication step. It is of size number_of_communication_steps.
-
const std::vector<uint64_t> ownership_per_rank
Rows that mark the memory distribution per rank.
Rank i is in charge of rows ownership_per_rank[i-1] (inclusive) to ownership_per_rank[i] (exclusive), where ownership_per_rank[-1] is understood to be zero. The layout is identical to that of the Vector class.
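The half-open row ranges described by ownership_per_rank can be sketched as follows. The even split in make_ownership is an assumption made for illustration only (the library may distribute rows differently), and all function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical even split of dim rows over world_size ranks.
// Entry r holds the exclusive end of the rows owned by rank r.
std::vector<uint64_t> make_ownership(uint64_t dim, int32_t world_size) {
    assert(world_size > 0);
    const uint64_t n = static_cast<uint64_t>(world_size);
    std::vector<uint64_t> ownership(n);
    uint64_t end = 0;
    for (uint64_t r = 0; r < n; ++r) {
        end += dim / n + (r < dim % n ? 1 : 0);  // first ranks take the remainder
        ownership[r] = end;
    }
    return ownership;
}

// First row of a rank; the entry "before rank 0" is taken to be zero.
uint64_t rank_start(const std::vector<uint64_t>& ownership, int32_t rank) {
    return rank == 0 ? 0 : ownership[static_cast<std::size_t>(rank) - 1];
}

// One past the last row of a rank.
uint64_t rank_end(const std::vector<uint64_t>& ownership, int32_t rank) {
    return ownership[static_cast<std::size_t>(rank)];
}

// Rank owning a given row: the first entry strictly greater than the row index.
int32_t owner_of_row(const std::vector<uint64_t>& ownership, uint64_t row) {
    assert(!ownership.empty() && row < ownership.back());
    auto it = std::upper_bound(ownership.begin(), ownership.end(), row);
    return static_cast<int32_t>(it - ownership.begin());
}
```

In this picture, rank_start and rank_end play the role of start and end, and their difference corresponds to mydim.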
-
const std::vector<std::vector<uint64_t>> number_of_elements_to_send_per_rank_per_step
Number of elements that are sent in each communication step to each MPI rank.
Size is number_of_communication_steps times world_size. number_of_elements_to_send_per_rank_per_step[i][j] refers to the number of elements that are sent in the i-th communication step to rank j.
-
const std::vector<std::vector<uint64_t>> number_of_elements_to_recv_per_rank_per_step
Number of elements that are received in each communication step from each MPI rank.
Size is number_of_communication_steps times world_size. number_of_elements_to_recv_per_rank_per_step[i][j] refers to the number of elements that are received in the i-th communication step from rank j.
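How the per-rank counts might relate to the per-step totals and to the maximal buffer size is sketched below. The relations used here (a per-step total is the sum over ranks, and the maximum is taken over the steps) are assumptions inferred from the member names and are not confirmed by this page; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// counts[i][j]: number of elements exchanged with rank j in step i.
using Counts = std::vector<std::vector<uint64_t>>;

// Assumed relation: the total of a step is the sum over all ranks.
std::vector<uint64_t> totals_per_step(const Counts& counts,
                                      std::size_t world_size) {
    std::vector<uint64_t> totals;
    totals.reserve(counts.size());
    for (const auto& step : counts) {
        assert(step.size() == world_size);  // one entry per rank in every step
        totals.push_back(std::accumulate(step.begin(), step.end(), uint64_t{0}));
    }
    return totals;
}

// Assumed relation: a buffer allocated once must fit the largest step.
uint64_t maximal_elements(const std::vector<uint64_t>& totals) {
    return totals.empty() ? 0 : *std::max_element(totals.begin(), totals.end());
}
```

Sizing a buffer by the largest step is one way to allocate communication memory only once, as described for the construction of the ShellMatrix.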
Friends
-
template<class T>
friend PetscErrorCode MATMUL_PETSC(Mat shell, Vec input, Vec output)
C function used by PETSc to perform the matrix-vector multiplication.
shell is a MATSHELL object from PETSc that is used for matrix-free multiplication. Its context pointer points to a ShellMatrix, which provides the full memory layout for the communication pattern and avoids reallocating memory. The function simply calls the low-level function MATMUL_basic(…).
- Parameters:
shell – MATSHELL object
input – Input Vec
output – Output Vec
- Returns:
PetscErrorCode