
Efficient Algorithms and Parallel Implementations for Power Series Multiplication
Abstract
Power series play an important role in solving differential equations and approximating functions. A key operation in manipulating power series is their multiplication. Power series multiplication algorithms that work with a prescribed precision, say a natural number $n$, take the first $n$ coefficients of the two power series as input, multiply them, and return the first $n$ coefficients of the product. While these algorithms can be fast, increasing the precision of the product forces them to recompute terms that are already known. Lazy or relaxed multiplication algorithms, on the other hand, compute the terms of the product incrementally, which allows the precision of the product to be increased dynamically without recomputing the terms already obtained. In this thesis, we discuss efficient multiplication algorithms for univariate and multivariate power series based on various schemes, including the Karatsuba algorithm, a novel partition multiplication technique using FFT, and an evaluation-interpolation strategy, together with their complexity analyses and parallelization opportunities. These algorithms and methods are implemented in C++ and integrated into the BPAS (Basic Polynomial Algebra Subprograms) library. To parallelize the implementations, we use a thread pool with a work-stealing scheduler built with modern C++ multithreading techniques. Performance results comparing the execution times of the various algorithms in serial and parallel modes are presented and analyzed.
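
To make the contrast between fixed-precision and lazy multiplication concrete, the following minimal C++ sketch shows a classical truncated product up to a prescribed precision $n$ alongside a routine that computes a single product coefficient on demand. The dense vector-of-double coefficient representation and the names truncatedProduct and productCoefficient are illustrative assumptions for this sketch only and are not part of the BPAS interface or the algorithms developed in the thesis.

#include <cstddef>
#include <iostream>
#include <vector>

// Fixed-precision product: given the first n coefficients of f and g,
// return the first n coefficients of f * g (classical O(n^2) scheme).
std::vector<double> truncatedProduct(const std::vector<double>& f,
                                     const std::vector<double>& g,
                                     std::size_t n) {
    std::vector<double> h(n, 0.0);
    for (std::size_t i = 0; i < n && i < f.size(); ++i)
        for (std::size_t j = 0; i + j < n && j < g.size(); ++j)
            h[i + j] += f[i] * g[j];
    return h;
}

// Lazy-style access: compute only the k-th coefficient of f * g, so the
// precision can grow one term at a time without redoing earlier work.
double productCoefficient(const std::vector<double>& f,
                          const std::vector<double>& g,
                          std::size_t k) {
    double c = 0.0;
    for (std::size_t i = 0; i <= k; ++i)
        if (i < f.size() && k - i < g.size())
            c += f[i] * g[k - i];
    return c;
}

int main() {
    // f = 1 + x + x^2 + ... (6 terms), g = 1 - x; their product is 1 - x^6.
    std::vector<double> f(6, 1.0), g{1.0, -1.0};
    for (double c : truncatedProduct(f, g, 6)) std::cout << c << ' ';
    std::cout << '\n';
    std::cout << productCoefficient(f, g, 3) << '\n';  // coefficient of x^3 is 0
    return 0;
}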