Monograph

# Efficient Algorithms and Parallel Implementations for Power Series Multiplication

## Degree

Master of Science

Computer Science

## Supervisor

Moreno Maza, Marc

## Abstract

Power series play an important role in solving differential equations and approximating functions. A key operation in manipulating power series is their multiplication. Power series multiplication algorithms that operate at a prescribed precision, say $n$ (where $n$ is a natural number), take the first $n$ coefficients of the two power series as input, multiply them, and return the first $n$ coefficients of the product. While these algorithms can be fast, they incur the overhead of recomputing already-known terms whenever the precision of the product must be increased. On the other hand, lazy or relaxed multiplication algorithms compute the product terms incrementally, which allows the precision of the product to be raised dynamically without recomputing the terms that are already known. In this thesis, we discuss efficient multiplication algorithms for univariate and multivariate power series, based on various schemes, including the Karatsuba algorithm, a novel partition multiplication technique using FFT, and an evaluation-interpolation strategy, along with their complexity analyses and parallelization opportunities. These algorithms and methods are implemented in C++ and integrated into the BPAS (Basic Polynomial Algebra Subprograms) library. To parallelize the implementations, we use a thread pool with a work-stealing scheduler, built with modern C++ multithreading techniques. The performance results, comparing the execution times of various algorithms in both serial and parallel modes, are presented and analyzed.
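To illustrate the lazy approach described above, the following is a minimal sketch (not the BPAS implementation, and with hypothetical names such as `LazyProduct`): product coefficients $c_n = \sum_{i=0}^{n} f_i\, g_{n-i}$ are computed on demand and cached, so raising the precision never recomputes known terms.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Minimal sketch of a lazy univariate power series product.
// Coefficients of f and g are supplied by generator functions; each
// product coefficient c_n = sum_{i=0}^{n} f_i * g_{n-i} is computed
// once, when first requested, and cached.
class LazyProduct {
public:
    using Gen = std::function<double(std::size_t)>;

    LazyProduct(Gen f, Gen g) : f_(std::move(f)), g_(std::move(g)) {}

    // Return the coefficient of x^n in f*g, extending the cache as needed.
    double coeff(std::size_t n) {
        while (cache_.size() <= n) {
            std::size_t k = cache_.size();
            double c = 0.0;
            for (std::size_t i = 0; i <= k; ++i)
                c += f_(i) * g_(k - i);   // convolution term f_i * g_{k-i}
            cache_.push_back(c);
        }
        return cache_[n];
    }

private:
    Gen f_, g_;
    std::vector<double> cache_;  // product coefficients computed so far
};
```

For example, with $f = g = 1/(1-x)$ (all coefficients equal to 1), the coefficient of $x^n$ in the product is $n+1$, and asking for higher-order coefficients later simply extends the cache.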

## Summary for Lay Audience

Power series are polynomial-like objects with potentially infinitely many terms and are utilized in various mathematical areas, such as function approximation and solving differential equations. For example, computers approximate transcendental functions like $\sin(x)$ by summing the terms of a power series. Our focus is on the multiplication of power series, which forms the basis for other mathematical operations. There are two primary approaches to multiplying two formal power series, $f$ and $g$. Methods working at a prescribed precision involve expanding $f$ and $g$ up to a predetermined order, multiplying these expansions, and then truncating the product. While these methods can use fast multiplication algorithms, such as the Fast Fourier Transform (FFT), they lack the flexibility to reuse previous computations when higher precision is required. In contrast, lazy and relaxed methods compute the product terms incrementally, allowing for dynamic precision improvements. In this thesis, we explore and implement various static and relaxed algorithms and strategies for multiplying both univariate and multivariate power series. We identify opportunities for concurrency within these algorithms and parallelize them using appropriate parallel programming techniques. The performance results of these implementations are presented and discussed.
