Problem Description | Matrix-Vector Multiplication Using MapReduce
Matrix-Vector Multiplication Using MapReduce We aim to multiply a matrix AA of size m×nm \times n with a vector vv of size nn to get a resultant vector uu of size mm: u[i]=∑j=1nA[i][j]⋅v[j]u[i] = \sum_{j=1}^n A[i][j] \cdot v[j]
Where:
- A[i][j]A[i][j] is the element at the ii-th row and jj-th column of the matrix.
- v[j]v[j] is the jj-th element of the vector.
- u[i]u[i] is the ii-th element of the resulting vector.

This operation can be performed in a distributed manner using the MapReduce framework, which breaks down the computation into the following steps:
1. Input Representation
In a distributed system, data is typically stored as key-value pairs. Here’s how the input data is represented:
Matrix AA
Each element A[i][j]A[i][j] is stored as a
key-value pair: Key: (i,j)Value: A[i][j]\text{Key: } (i, j) \quad \text{Value: } A[i][j]
For example, if AA is: A=[1234]A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
The matrix representation is:
(0, 0) -> 1
(0, 1) -> 2
(1, 0) -> 3
(1, 1) -> 4
Matrix-Vector Multiplication Using MapReduce
Vector vv
Each element v[j]v[j] is stored as: Key: jValue: v[j]\text{Key: } j \quad \text{Value: } v[j]
For example, if v=[5,6]v = [5, 6], the vector representation is:
0 -> 5
1 -> 6
Matrix-Vector Multiplication Using MapReduce
2. Mapper Phase
Purpose:
The mapper calculates the intermediate product A[i][j]⋅v[j]A[i][j] \cdot v[j] for each element of the matrix and emits it as a key-value pair where the key is ii (the row index of the matrix).
Input to Mapper:
- A matrix element: (i,j)−>A[i][j](i, j) -> A[i][j]
- A vector element: j−>v[j]j -> v[j]
Logic:
For each matrix entry (i,j)(i, j):
- Retrieve the corresponding vector value v[j]v[j].
- Compute the product A[i][j]⋅v[j]A[i][j] \cdot v[j].
- Emit the result as i−>A[i][j]⋅v[j]i -> A[i][j] \cdot v[j].
Output of Mapper:
Key-value pairs where the key is ii (row index) and the value is the product A[i][j]⋅v[j]A[i][j] \cdot v[j].
Example:
For A=[1234]A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} and v=[5,6]v = [5, 6]:
- Mapper reads (0,0)−>1(0, 0) -> 1 and 0−>50 -> 5, emits:
0 -> 5
- Mapper reads (0,1)−>2(0, 1) -> 2 and 1−>61 -> 6, emits:
0 -> 12
- Mapper reads (1,0)−>3(1, 0) -> 3 and 0−>50 -> 5, emits:
1 -> 15
- Mapper reads (1,1)−>4(1, 1) -> 4 and 1−>61 -> 6, emits:
1 -> 24
3. Shuffle and Sort Phase
The framework groups all intermediate values by their keys (row indices). After shuffling and sorting, the intermediate data looks like this:
0 -> [5, 12]
1 -> [15, 24]
Matrix-Vector Multiplication Using MapReduce
4. Reducer Phase
Purpose:
The reducer aggregates all intermediate values for each row ii and calculates the sum to produce the final result u[i]u[i].
Input to Reducer:
- Key: ii (row index)
- Values: A list of intermediate products [A[i][j1]⋅v[j1],A[i][j2]⋅v[j2],…][A[i][j1] \cdot v[j1], A[i][j2] \cdot v[j2], …]
Logic:
- Sum all values in the list.
- Emit the result as i−>sumi -> \text{sum}.
Output of Reducer:
Key-value pairs representing the resulting vector uu: i−>u[i]i -> u[i]
Example:
For the shuffled data:
- Reducer for key
0
receives values[5, 12]
, computes: 5+12=175 + 12 = 17, emits:0 -> 17
- Reducer for key
1
receives values[15, 24]
, computes: 15+24=3915 + 24 = 39, emits:1 -> 39
5. Final Result
The final output is the resultant vector uu: u=[17,39]u = [17, 39]
MapReduce Pseudocode | Matrix-Vector Multiplication Using MapReduce
Mapper:
def mapper(matrix_entry):
i, j, matrix_value = matrix_entry # matrix_entry is (i, j, A[i][j])
vector_value = get_vector_value(j) # Retrieve v[j] from the distributed cache or lookup
product = matrix_value * vector_value
emit(i, product) # Key: row index (i), Value: A[i][j] * v[j]
Reducer:
def reducer(row_index, values):
result = sum(values) # Sum all products for row i
emit(row_index, result) # Key: row index (i), Value: u[i]
Advantages of MapReduce for Matrix-Vector Multiplication
- Scalability: Handles large matrices and vectors by distributing computations across multiple nodes.
- Parallelism: Each mapper processes a subset of matrix rows independently.
- Fault Tolerance: MapReduce handles failures by re-executing failed tasks.
Applications – Matrix-Vector Multiplication Using MapReduce
- Distributed linear algebra computations.
- Machine learning (e.g., gradient descent).
- Scientific simulations and data analysis.
FAQ’s
1. What is the purpose of using MapReduce for matrix-vector multiplication?
MapReduce allows parallel processing of large datasets, making it efficient for distributed systems to handle massive matrices and vectors.
2. How is the matrix represented in MapReduce?
The matrix is represented as key-value pairs, where the key is the row and column index (i,j)(i, j), and the value is A[i][j]A[i][j].
3. What does the mapper emit in this process?
The mapper emits intermediate key-value pairs where the key is the row index ii, and the value is the product A[i][j]⋅v[j]A[i][j] \cdot v[j].
4. What is the role of the reducer?
The reducer aggregates all products for each row ii and computes their sum to generate the corresponding element of the result vector u[i]u[i].
5. Why is MapReduce preferred for large-scale matrix computations?
It distributes computations across multiple nodes, ensuring scalability, fault tolerance, and efficient handling of large-scale data.