AARMS-CMS Student Poster Session
- RAHUL PADMANABHAN, Concordia University
Approximating Matrix Functions Using Transformers [PDF]
Transformers are at the forefront of cutting-edge artificial intelligence today.
They are used in natural language processing, notably in large language models (LLMs) such as ChatGPT and Llama.
Prior work has shown that transformers can perform certain linear algebra computations by learning from randomly generated data.
However, there are few studies indicating the extent to which they can be used for advanced numerical computations. We explore the use of transformers in approximating matrix functions.
Matrix functions extend ordinary scalar functions to matrices: they take a matrix as input and return a matrix as output.
As a transformer block is mathematically represented as a parameterized function $f_\theta: \mathbb{R}^{n \times d} \rightarrow \mathbb{R}^{n \times d}$,
we represent real numbers in the matrix using encoding schemes to obtain results from the transformer.
Our objective is to determine whether transformers can be used to approximate matrix functions by learning from randomly generated, encoded data.
Specifically, we focus on several problems in the domain of functions of matrices, including matrix powers, the $p$-th root of a matrix, and the matrix exponential.
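As an illustration of the three matrix functions named above, the following is a minimal NumPy sketch that computes them for a random symmetric positive definite matrix via its eigendecomposition, $f(A) = Q\, f(\Lambda)\, Q^{T}$. The restriction to symmetric positive definite inputs, the helper names `matrix_function_spd` and `random_spd`, and the choice $p = 3$ are assumptions made for this example; they are not taken from the poster.

```python
import numpy as np


def matrix_function_spd(a, f):
    """Apply a scalar function f to a symmetric matrix a.

    Uses the eigendecomposition a = Q diag(w) Q^T and returns
    Q diag(f(w)) Q^T. Valid only for symmetric (Hermitian) input.
    """
    eigvals, eigvecs = np.linalg.eigh(a)
    # Scaling column j of Q by f(w_j) is the same as Q @ diag(f(w)).
    return (eigvecs * f(eigvals)) @ eigvecs.T


def random_spd(n, rng):
    """Draw a random symmetric positive definite n x n matrix (hypothetical helper)."""
    b = rng.standard_normal((n, n))
    return b @ b.T + n * np.eye(n)


rng = np.random.default_rng(0)
A = random_spd(3, rng)
p = 3  # assumed exponent / root order for this example

A_power = matrix_function_spd(A, lambda w: w ** p)         # A^p
A_root = matrix_function_spd(A, lambda w: w ** (1.0 / p))  # principal p-th root
A_exp = matrix_function_spd(A, np.exp)                     # matrix exponential
```

Pairs such as `(A, A_root)` are the kind of randomly generated input and target data a model could be trained on; general (non-symmetric) matrices would need a different numerical method.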
In this poster, after providing the necessary background on transformers and matrix functions, we describe our methodology for approximating matrix functions using transformers
and discuss the results of our numerical experiments.
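The abstract does not say which encoding schemes are used. As one possible illustration, the sketch below encodes each real entry as three tokens (sign, fixed-width mantissa, base-10 exponent) and flattens a matrix into a token sequence prefixed by its dimensions. The scheme itself, the token formats (`V2`, `E-2`), and the function names are hypothetical and are not claimed to match the poster's method.

```python
import math


def encode_number(x, digits=3):
    """Encode a float as [sign, mantissa, exponent] tokens (hypothetical scheme)."""
    if x == 0:
        return ["+", "0" * digits, "E0"]
    sign = "+" if x > 0 else "-"
    # Choose the exponent so that the mantissa has exactly `digits` digits.
    exponent = math.floor(math.log10(abs(x))) - (digits - 1)
    mantissa = round(abs(x) / 10 ** exponent)
    if mantissa == 10 ** digits:  # rounding overflow, e.g. 999.6 -> 1000
        mantissa //= 10
        exponent += 1
    return [sign, f"{mantissa:0{digits}d}", f"E{exponent}"]


def decode_number(tokens):
    """Invert encode_number, up to the rounding applied to the mantissa."""
    sign, mantissa, exponent = tokens
    value = int(mantissa) * 10.0 ** int(exponent[1:])
    return value if sign == "+" else -value


def encode_matrix(rows, digits=3):
    """Flatten a matrix (list of rows) into tokens, prefixed by its shape."""
    tokens = [f"V{len(rows)}", f"V{len(rows[0])}"]
    for row in rows:
        for x in row:
            tokens.extend(encode_number(x, digits))
    return tokens


tokens = encode_matrix([[1.0, -2.5], [0.0, 10.0]])
```

A fixed mantissa width keeps the vocabulary small at the cost of rounding each entry to `digits` significant figures, which bounds the accuracy any model trained on these tokens could reach.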