Matrix Chain Multiplication

Get started

🙏
জয় শ্রী রাম
🕉

Dynamic Programming: All possible Cuts in all possible Intervals for the Last Operation is an absolute pre-requite for this chapter.

Problem Statement:

Given a sequence of matrices, find the most efficient way to multiply these matrices together. The problem is not actually to perform the multiplications, but merely to decide in which order to perform the multiplications.

We have many options to multiply a chain of matrices because matrix multiplication is associative. In other words, no matter how we parenthesize the product, the result will be the same. For example, if we had four matrices A, B, C, and D, we would have:

(ABC)D = (AB)(CD) = A(BCD) = ....
However, the order in which we parenthesize the product affects the number of simple arithmetic operations needed to compute the product, or the efficiency.

To multiply a matrix of dimension (m x n) with another matrix of dimension (n x k), we will have to do (m x n x k) operation. So the cost of the multiplication is (m x n x k). If you have hard time understanding it I would highly recommend you revisiting how matrix multiplication works.

For example, suppose A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix. Then,

(AB)C = (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations
A(BC) = (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
Clearly the first parenthesization requires less number of operations.
Given an array p[] which represents the chain of matrices such that the i^th matrix Ai is of dimension p[i-1] x p[i]. We need to write a function MatrixChainOrder() that should return the minimum number of multiplications needed to multiply the chain.

Input: p[] = {40, 20, 30, 10, 30}
Output: 26000
There are 4 matrices of dimensions 40x20, 20x30, 30x10 and 10x30.
Let the input 4 matrices be A, B, C and D. The minimum number of multiplications are obtained by putting parenthesis in following way:
(A(BC))D --> 20*30*10 + 40*20*10 + 40*10*30

Input: p[] = {10, 20, 30, 40, 30}>br/> Output: 30000
There are 4 matrices of dimensions 10x20, 20x30, 30x40 and 40x30.
Let the input 4 matrices be A, B, C and D. The minimum number of multiplications are obtained by putting parenthesis in following way
((AB)C)D --> 10*20*30 + 10*30*40 + 10*40*30

Input: p[] = {10, 20, 30}
Output: 6000
There are only two matrices of dimensions 10x20 and 20x30. So there is only one way to multiply the matrices, cost of which is 10*20*30

Solution:

This problem can also be named Parenthesization, where open and close parenthesis represents how the multiplication operations would be performed.

This classic DP problem is quite easy if you already gotten the hang of Dynamic Programming concept and you actually know how Dynamic Programming fundamentally works. The first thought that comes to mind is that we should be able to first compute the result for the smaller length sub-expression and using those results we would compute the result of the overall given expression.

Another very related thought automatically occurs: we definitely need to divide the whole expression into two parts and multiply the result of the two parts. This result is the result for the whole expression. Now, Dividing on different position in the expression would give different result. So we need to divide the expression in all possible positions and take the one that gives the most optimal result (in our case we are trying to minimize). Notice how this thought naturally aligns to the core concept of Dynamic Programming. When we are partitioning the given expression into two parts, we are creating shorter length sub-problems. While computing result for the subproblems we do the same thing. So ultimately we would be using already computed results of the subproblems to compute for the given problem, and this is exactly what Optimal Substructure property is. We would see that some of the subproblem results would be used multiple times, which is the manifestation the Overlapping Subproblems property of Dynamic Programming.

Once we have figured that our approach is to compute results for shorter length subproblems first and then eventually solve for the longer length subproblems which would actually lead us to compute for the given problem, it becomes very easy to come up with a bug-free code, because we know a common way of implementing it: Tabulation. We will be solving this in bottom-up approach.

Also notice that, how this problem is using the Dynamic Programming: All possible Cuts in all possible Intervals for the Last Operation concept. We divide the given expression in two halves in all possible way and take the one that gives the minimum answer. So we are basically making cuts at all possible way to see which cut gives the minimum answer. So what this cut implies here: This is the last multiplication operation that would done in the process of evaluating the whole expression. And why last ? Because we have already computed for the two sub-problems created.

Let's look at the code below and you would know what I am trying to say here:

Java code:

Login to Access Content

Python code:

Login to Access Content

Must Read:

Instructor:

Abhishek Dey

Senior SDE | Chief Architect

AWS | Microsoft | University of Florida

View LinkedIn profile