Autodocs

template<class Op = danceq::internal::Operator<danceq::internal::BasisU1<danceq::internal::ContainerTable<danceq::internal::State<128, 2>>>, double, danceq::Hamiltonian>>
class ShellMatrix

Given an Operator, the ShellMatrix class provides a matrix-free matrix-vector multiplication.

Template parameters

Things to know

  • The class supports distributed-memory (MPI) and shared-memory (OpenMP) parallelization.

    • If available, MPI is used.

    • Otherwise, if OpenMP is available, it is used.

    • If neither is available, the code runs without parallelization.

  • The syntax for each parallelization option is equivalent.

  • The ShellMatrix should be returned from the Operator class via create_ShellMatrix(…).

  • All member functions are const; a ShellMatrix cannot be modified after creation.

  • The code works with the Vector class which distributes memory across all processes if MPI is used. Use the function create_Vector() to ensure the correct memory layout.

  • The following applies only if MPI is utilized and the Vector is distributed across multiple processes. The full communication pattern is analyzed when the ShellMatrix is created. Therefore, additional memory used during the multiplication process to communicate between different processes is only allocated once. The memory overhead due to the communication can be limited during the construction.

  • The debugging level can be adjusted from 0 (no debugging) to 10 (maximal debugging):

    #define dbug_level 10 // 10 is maximal debugging, 0 is no debugging
    

Warning

The system size L has to be smaller than or equal to the maximal number of sites MaxSites of the underlying State class.

Public Types

using state_class = typename Op::state_class

State class.

The State class can be retrieved from the ShellMatrix class:

// This is equivalent to: State state;
ShellMatrix::state_class state;  

using scalartype = typename Op::scalartype

Scalar.

The underlying scalar defined from the Operator class can be retrieved from the ShellMatrix class:

ShellMatrix::scalartype A;  

using scalartype_real = decltype(std::real(std::declval<scalartype>()))

Real part of the scalar.

The real part of the underlying scalar defined from the Operator class can be retrieved from the ShellMatrix class:

ShellMatrix::scalartype_real A;  

This is equivalent to scalartype if it is real. If scalartype is complex, e.g., std::complex<double>, scalartype_real is double.
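The behaviour of scalartype_real can be checked at compile time with standard C++ alone. The sketch below repeats the same decltype construction as a stand-alone alias; the name real_part_t is hypothetical and not part of the library.

```cpp
#include <complex>
#include <type_traits>
#include <utility>

// Hypothetical stand-alone alias using the same construction as scalartype_real.
template <class Scalar>
using real_part_t = decltype(std::real(std::declval<Scalar>()));

// A real scalar maps to itself.
static_assert(std::is_same<real_part_t<double>, double>::value,
              "real scalar stays unchanged");

// A complex scalar maps to its underlying real type.
static_assert(std::is_same<real_part_t<std::complex<double>>, double>::value,
              "std::complex<double> yields double");
static_assert(std::is_same<real_part_t<std::complex<float>>, float>::value,
              "std::complex<float> yields float");
```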

Public Functions

int32_t apply(const danceq::internal::Vector<scalartype> &input, danceq::internal::Vector<scalartype> &output, const bool reset_output_to_zero = true) const

High-level function to perform a matrix-free multiplication using the Vector class.

The function is a wrapper for the low-level function MATMUL_basic(…). Input and output are instances of the Vector class that should be created with create_Vector(). Before the multiplication is executed, the output data can be set to zero using the boolean reset_output_to_zero.

Parameters:
  • input – Input Vector

  • output – Output Vector

  • reset_output_to_zero – Boolean to set output to zero before multiplication

Returns:

error_code

Vector<scalartype> operator*(const Vector<scalartype> &v) const

High-level function to perform a matrix-free multiplication using the Vector class.

The function is a wrapper for the low-level function MATMUL_basic(…). Input is an instance of the Vector class that should be created with create_Vector().

The usage is straightforward:

w = H*v;

Parameters:

v – Input Vector

Returns:

Output vector
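To illustrate what "matrix-free" means here, the following is a minimal, self-contained sketch and not the library's implementation. ToyShellOperator is a hypothetical stand-in that uses std::vector<double> instead of the Vector class: it never stores a matrix, generates the entries of a tridiagonal operator on the fly, and offers operator* as a thin wrapper around apply, mirroring the w = H*v syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy operator (not part of DanceQ): a tridiagonal matrix with
// `diag` on the diagonal and `off` on the two neighbouring diagonals.
// No matrix is stored; the entries are generated inside apply().
struct ToyShellOperator {
    std::size_t dim;
    double diag;
    double off;

    // output = H * input, or output += H * input if reset_output_to_zero is false.
    void apply(const std::vector<double>& input, std::vector<double>& output,
               bool reset_output_to_zero = true) const {
        assert(input.size() == dim && output.size() == dim);
        if (reset_output_to_zero) {
            for (double& x : output) x = 0.0;
        }
        for (std::size_t i = 0; i < dim; ++i) {
            double acc = diag * input[i];
            if (i > 0) acc += off * input[i - 1];
            if (i + 1 < dim) acc += off * input[i + 1];
            output[i] += acc;
        }
    }

    // Convenience wrapper so that w = H * v works.
    std::vector<double> operator*(const std::vector<double>& v) const {
        std::vector<double> w(dim, 0.0);
        apply(v, w);
        return w;
    }
};
```

With ToyShellOperator H{3, 2.0, -1.0} and v = {1, 2, 3}, the product H * v evaluates to {0, 0, 4}.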

scalartype get_expectation_value(const danceq::internal::Vector<scalartype> &lhs, const danceq::internal::Vector<scalartype> &rhs) const

High-level function to compute the expectation value using the Vector class.

The function is a wrapper for the low-level function EXPECTATION_VALUE_basic(…). lhs and rhs refer to the left- and right-hand side, respectively. Both are instances of the Vector class that should be created with create_Vector(). The complex conjugate of the left-hand side is used when the scalars are complex.

Parameters:
  • lhs – Left-hand side

  • rhs – Right-hand side

Returns:

Expectation value

scalartype get_expectation_value(const danceq::internal::Vector<scalartype> &psi) const

High-level function to compute the expectation value using the Vector class.

The function is a wrapper for get_expectation_value(psi,psi) which calls the low-level function EXPECTATION_VALUE_basic(…). The input is an instance of the Vector class that should be created with create_Vector(). The complex conjugate is used for the left-hand side.

Parameters:

psi – Left- and right-hand side

Returns:

Expectation value

Vector<scalartype> create_Vector(void) const

Creates a Vector class instance compatible with the ShellMatrix.

The function creates an instance of the Vector class that can be used in the computations. This is particularly important when MPI is enabled to ensure the correct memory layout. The Vector class has many additional features for simple linear algebra routines.

Returns:

Vector

int32_t info(const bool print_output = true, const std::string pre = "") const

Prints info and checks data.

Parameters:
  • print_output – Prints output if true

  • pre – String printed in front of the output

Returns:

error_code

uint64_t get_dim(void) const

Returns the total dimension dim.

Returns:

dim

uint64_t get_mydim(void) const

Returns the local dimension.

This is only available with MPI.

Returns:

mydim

uint64_t get_start(void) const

Returns the beginning of the rank’s part.

This is only available with MPI.

Returns:

start

uint64_t get_end(void) const

Returns the end of the rank’s part.

This is only available with MPI.

Returns:

end

Protected Functions

ShellMatrix(const Op *operator_ptr_, const int32_t myrank_, const int32_t world_size_, const uint64_t start_, const uint64_t end_, const uint64_t number_of_communication_steps_, const uint64_t maximal_number_of_elements_to_recv_, const uint64_t maximal_number_of_elements_to_send_, const std::vector<uint64_t> dim_per_step_, const std::vector<uint64_t> total_number_of_elements_to_recv_per_step_, const std::vector<uint64_t> total_number_of_elements_to_send_per_step_, const std::vector<uint64_t> ownership_per_rank_, const std::vector<std::vector<uint64_t>> number_of_elements_to_send_per_rank_per_step_, const std::vector<std::vector<uint64_t>> number_of_elements_to_recv_per_rank_per_step_)

Constructor used by the Operator class.

The full communication pattern was generated by the Operator class when calling create_ShellMatrix(…). The memory overhead can be limited by overhead_in_GB_per_core.

The parameters set the members listed under Private Members.

Parameters:
  • operator_ptr_ – Sets operator_ptr, the pointer to the Operator

  • myrank_ – Sets myrank

  • world_size_ – Sets world_size

  • start_ – Sets start

  • end_ – Sets end

  • number_of_communication_steps_ – Sets number_of_communication_steps

  • maximal_number_of_elements_to_recv_ – Sets maximal_number_of_elements_to_recv

  • maximal_number_of_elements_to_send_ – Sets maximal_number_of_elements_to_send

  • dim_per_step_ – Sets dim_per_step

  • total_number_of_elements_to_recv_per_step_ – Sets total_number_of_elements_to_recv_per_step

  • total_number_of_elements_to_send_per_step_ – Sets total_number_of_elements_to_send_per_step

  • ownership_per_rank_ – Sets ownership_per_rank

  • number_of_elements_to_send_per_rank_per_step_ – Sets number_of_elements_to_send_per_rank_per_step

  • number_of_elements_to_recv_per_rank_per_step_ – Sets number_of_elements_to_recv_per_rank_per_step

Private Functions

int32_t MATMUL_basic(const scalartype *input, scalartype *output, const bool reset_output_to_zero = true) const

Low-level function to perform the matrix-free matrix-vector multiplication.

This function is the core of the ShellMatrix class and performs the matrix-free multiplication. It takes input and output pointers to correctly allocated data. For example, they can point to the data of the Vector class or to the data of the internal vector structure Vec of PETSc. Note that if the data is distributed over different processes with MPI, the pointers refer only to the local data. The input is marked const and is therefore not modified. Before the multiplication is executed, the output data can be set to zero using the boolean reset_output_to_zero. Not resetting the output allows an efficient implementation of Krylov space methods in which only two (instead of three) vectors are stored. The function has four different implementations (two for MPI, one for OpenMP, and one without parallelization) as discussed here.

Parameters:
  • input – Pointer to the local input data

  • output – Pointer to the local output data

  • reset_output_to_zero – Boolean to set output to zero before multiplication

Returns:

error_code
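The remark about storing two instead of three vectors can be made concrete with a three-term recurrence step of the form w = H v - beta v_prev. The sketch below is only an illustration under stated assumptions: toy_matmul is a hypothetical stand-in for MATMUL_basic (a diagonal operator with entries 1, 2, 3, … and an explicit dim argument), and recurrence_step is likewise invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MATMUL_basic: a diagonal operator with entries
// 1, 2, 3, ... acting on raw pointers. Returns 0 as a placeholder error code.
int toy_matmul(const double* input, double* output, std::size_t dim,
               bool reset_output_to_zero = true) {
    for (std::size_t i = 0; i < dim; ++i) {
        if (reset_output_to_zero) output[i] = 0.0;
        output[i] += static_cast<double>(i + 1) * input[i];
    }
    return 0;
}

// One three-term recurrence step. On entry `prev` holds v_{j-1}; on exit it
// holds H * v - beta * v_{j-1}. Because the product is accumulated into
// `prev`, no third work vector is required.
int recurrence_step(const std::vector<double>& v, std::vector<double>& prev,
                    double beta) {
    assert(v.size() == prev.size());
    for (double& x : prev) x *= -beta;  // prev <- -beta * v_{j-1}
    return toy_matmul(v.data(), prev.data(), v.size(),
                      /*reset_output_to_zero=*/false);
}
```

For v = {1, 1, 1}, prev = {1, 0, 2}, and beta = 0.5, the call leaves prev = {0.5, 2, 2}.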

scalartype EXPECTATION_VALUE_basic(const scalartype *lhs, const scalartype *rhs) const

Low-level function to compute the expectation value using a matrix-free multiplication.

It takes two const inputs: lhs (left-hand side) and rhs (right-hand side). Both are pointers to correctly allocated data, e.g., from the Vector class. If complex scalars are used, the complex conjugate of lhs is used:

\[o = \sum_{ij} \texttt{conj}({\mathrm{lhs}}_i)\,\, O_{ij}\,\, \mathrm{rhs}_j \]

\(O\) refers to the Operator. The function has four different implementations (two for MPI, one for OpenMP, and one without parallelization) as discussed here.

Parameters:
  • lhs – Left-hand side

  • rhs – Right-hand side

Returns:

Expectation value
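The formula above can be spelled out in a few lines. The following sketch is not the library routine: for illustration it assumes a small dense, row-major matrix in place of the matrix-free Operator, and the function name expectation_value is hypothetical. It only shows where the complex conjugate enters.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// o = sum_{ij} conj(lhs_i) * O_ij * rhs_j
// `op` is a dense n x n matrix in row-major order (an assumption made for
// this illustration; the library evaluates O_ij without storing a matrix).
cplx expectation_value(const std::vector<cplx>& lhs,
                       const std::vector<cplx>& op,
                       const std::vector<cplx>& rhs) {
    const std::size_t n = rhs.size();
    assert(lhs.size() == n && op.size() == n * n);
    cplx result(0.0, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        cplx row_sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            row_sum += op[i * n + j] * rhs[j];  // (O * rhs)_i
        }
        result += std::conj(lhs[i]) * row_sum;  // conjugate only the lhs
    }
    return result;
}
```

For the Pauli-Y matrix {0, -i, i, 0} and the unnormalized vector psi = (1, i) on both sides, the result is 2. Omitting the conjugation would give 0 instead, which is why the convention matters for complex scalars.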

Private Members

const Op *operator_ptr

Pointer to the operator that is represented.

const uint64_t dim

Dimension.

const MPI_Datatype MPI_SCALAR

Datatype used by MPI.

const int32_t world_size

Number of MPI processes.

const int32_t myrank

MPI rank of this process.

const uint64_t mydim

Local dimension of rank.

const uint64_t start

Start of local rows enumerated by the underlying basis in operator_ptr.

const uint64_t end

End of local rows enumerated by the underlying basis in operator_ptr.

const uint64_t number_of_communication_steps

Number of communication steps between all processes for matrix-vector multiplication.

const uint64_t maximal_number_of_elements_to_recv

Maximal number of elements to receive during the multiplication.

const uint64_t maximal_number_of_elements_to_send

Maximal number of elements to send during the multiplication.

const std::vector<uint64_t> dim_per_step

The number of rows that are processed during each communication step. It is identical for each MPI process.

const std::vector<uint64_t> total_number_of_elements_to_recv_per_step

Total number of elements that are received in each communication step. It is of size number_of_communication_steps.

const std::vector<uint64_t> total_number_of_elements_to_send_per_step

Total number of elements that are sent in each communication step. It is of size number_of_communication_steps.

const std::vector<uint64_t> ownership_per_rank

Rows that mark the memory distribution per rank.

Rank i is in charge of rows ownership_per_rank[i-1] up to, but not including, ownership_per_rank[i]. For rank 0, the lower bound (formally ownership_per_rank[-1]) is zero. The layout is identical to that of the Vector class.
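The ownership rule can be sketched as a small helper. This is an illustration only: local_range and local_dim are hypothetical names, and the relation between the returned range and start, end, and mydim (mydim = end - start) is an assumption that the text suggests but does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Half-open row range [start, end) owned by `rank`.
// ownership_per_rank[rank] is the exclusive upper bound; the lower bound of
// rank 0 is zero, which replaces the formal entry ownership_per_rank[-1].
std::pair<std::uint64_t, std::uint64_t> local_range(
    const std::vector<std::uint64_t>& ownership_per_rank, std::size_t rank) {
    assert(rank < ownership_per_rank.size());
    const std::uint64_t start = (rank == 0) ? 0 : ownership_per_rank[rank - 1];
    return {start, ownership_per_rank[rank]};
}

// Number of rows owned by `rank` (assumed to correspond to mydim).
std::uint64_t local_dim(const std::vector<std::uint64_t>& ownership_per_rank,
                        std::size_t rank) {
    const auto range = local_range(ownership_per_rank, rank);
    return range.second - range.first;
}
```

For ownership_per_rank = {4, 7, 10}, rank 0 owns rows [0, 4), rank 1 owns [4, 7), and rank 2 owns [7, 10).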

const std::vector<std::vector<uint64_t>> number_of_elements_to_send_per_rank_per_step

Number of elements that are sent in each communication step to each MPI rank.

Size is number_of_communication_steps times world_size. number_of_elements_to_send_per_rank_per_step[i][j] refers to the number of elements that are sent in the i-th communication step to rank j.

const std::vector<std::vector<uint64_t>> number_of_elements_to_recv_per_rank_per_step

Number of elements that are received in each communication step from each MPI rank.

Size is number_of_communication_steps times world_size. number_of_elements_to_recv_per_rank_per_step[i][j] refers to the number of elements that are received in the i-th communication step from rank j.
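The [step][rank] layout of these tables can be illustrated with a short sketch. It assumes, without confirmation from the source, that the per-step totals (total_number_of_elements_to_recv_per_step and its send counterpart) are the sums over all ranks of the corresponding per-rank entries. The names CountTable and totals_per_step are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Outer index: communication step, inner index: MPI rank.
using CountTable = std::vector<std::vector<std::uint64_t>>;

// Sums the per-rank counts of every communication step.
// Assumption: this is how the per-step totals relate to the per-rank tables.
std::vector<std::uint64_t> totals_per_step(const CountTable& per_rank_per_step) {
    std::vector<std::uint64_t> totals;
    totals.reserve(per_rank_per_step.size());
    for (const auto& per_rank : per_rank_per_step) {
        // Every step is expected to list one entry per rank.
        assert(per_rank.size() == per_rank_per_step.front().size());
        totals.push_back(std::accumulate(per_rank.begin(), per_rank.end(),
                                         std::uint64_t{0}));
    }
    return totals;
}
```

For a table with two steps and three ranks, {{0, 3, 2}, {0, 1, 0}}, the totals are {5, 1}.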

Friends

template<class T>
friend PetscErrorCode MATMUL_PETSC(Mat shell, Vec input, Vec output)

C-function used by PETSc to perform matrix-vector multiplication.

shell is a MATSHELL object from PETSc that is used for matrix-free multiplication. Its context pointer points to a ShellMatrix, which provides the full memory layout for the communication pattern and avoids reallocating memory. The function simply calls the low-level function MATMUL_basic(…).

Parameters:
  • shell – MATSHELL object

  • input – Input Vec

  • output – Output Vec

Returns:

PetscErrorCode