This is the first in a series of posts about problems, algorithms, efficiency, and how to compare the difficulty of problems, calling in on the way at an open problem in mathematics that's literally worth a million dollars.

But first, let's talk about sums.

# Setting the scene with addition

We'll start with some school arithmetic. We learn how to add two numbers that each have a single digit. What is 3 plus 5? What is 4 plus 9? We learn these single digit sums, and if we forget we can just add things up on our fingers (and sometimes our toes).

 In case you're wondering, a numeral is a symbol or name that stands for a number. The number is an idea, the numeral is how we write it.
But when it comes time to add bigger numbers, we run out of fingers and toes, and we are forced to use marks on paper. We can use tally marks, but when you have 643 sheep it's a bit tedious to have to put a mark for each one. It's better to have the numerals, and then we want to be able to do arithmetic.

So we learn long addition. (No, I'm not going to talk about long division!) We learn that we write the two numbers out, one above the other, and we add them in columns a digit at a time. If the total for two of these digits in a column is more than 9 then a "carry" spills over to the next column, and so on.
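The column-by-column procedure can be sketched in a few lines of code. Here is a minimal Python illustration (the name `long_add` is my own, and of course Python can add integers directly - the point is purely to make the digit-and-carry bookkeeping explicit):

```python
def long_add(x, y):
    # School long addition: add digit pairs column by column,
    # carrying into the next column whenever a total exceeds 9.
    xs, ys = str(x)[::-1], str(y)[::-1]   # least significant digit first
    digits, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        a = int(xs[i]) if i < len(xs) else 0
        b = int(ys[i]) if i < len(ys) else 0
        carry, d = divmod(a + b + carry, 10)
        digits.append(str(d))
    if carry:
        digits.append(str(carry))
    return int(''.join(reversed(digits)))
```

Each digit is visited exactly once, which is the linear-time behaviour discussed below.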

It's pretty clear that bigger numbers take longer to add, because there is more to do. Twice the number of digits, twice the time taken. You might wonder if there would be a faster way to do this, a question we will return to time and time again, but since we have to look at every digit it seems reasonable that we can't do better.

We could ask about using binary instead of decimal, or base 8, or base 60 (as the Babylonians did), and in fact if you've remembered all of your base 60 single digit sums, then because the numbers are shorter in base 60 it will take less time. But it's still the case that twice the digits will take twice the time.

 Detail: The exact time taken depends on the number of carries. To take that into account we always talk about worst case timings, and the timings are considered to be "upper bounds." A brief note about notation - I will sometimes use a "centre dot" to represent multiplication, as I have done here, sometimes use a "lower dot," sometimes use a "times" sign, and sometimes not have any symbol at all, using juxtaposition. It should, in each case, be unambiguous that I'm performing a multiplication, and my choice will largely be driven by aesthetics, although occasionally the need for clarity and explicitness will override that. However, if you feel strongly enough about this I invite you to email me to discuss it.
Similarly, if we do it in binary each column will be quicker, but on average writing a number in binary takes three and a bit times as many digits as it does in base 10, so it will take three and a bit times as long to add up. But it's still the case that twice the digits will take twice the time.

So the time taken to perform the addition is some constant times the number of digits. The constant will be different for different bases and so on, but it will still be:

$T=c{\cdot}S$

where $T$ is the time taken, and $S$ is the size (number of digits) of the numbers involved.

The time taken is linear in the size of the input number(s).

# Moving up to multiplication

Now let's have a look at long multiplication. Here's an example:

               3   6   5
               8   7   2
     =====================  \
               6  12  10     \   Central
          21  42  35          >  Section
      24  48  40             /
     =====================  /
      24  69  88  47  10

      31   8   2   8   0

I've done this in a slightly unusual way to emphasise that the central section of the table is made up of the products of the individual digits of the initial numbers. The final number is then obtained by adding the columns and carrying digits as necessary.

We can ask how long this will take. Well, if we have $n$ digits in the first number, and $m$ digits in the second number, we have $nm$ products in the central section. Then we have up to $m$ rows to add, each with $n$ numbers, but they are skewed, so we have $n+m-1$ columns in total. The exact time taken for two specific numbers will depend on the number of carries we have to perform, but careful analysis (which we won't do here) shows that there are constants $c_2,$ $c_1,$ and $c_0$ such that when $k$ is the number of digits in the larger number, the time taken, $T(k),$ will be given by (or at least will be bounded by):

$T(k)=c_2.k^2+c_1.k+c_0$

That sounds complicated.

But this is where a simplification comes to our rescue - we choose to ignore the constants involved and just look at the general shape of the function. In short, the time taken to multiply will be dictated by the number of cells we have to complete in the above table, and that's proportional to $mn.$ As the numbers to be multiplied get larger and larger, so the time taken gets totally dominated by the first term, $c_2k^2,$ so we ignore the exact value of the constant $c_2$ and say that this algorithm is $k^2$ in the size of the input numbers.

So, in summary, the time taken by this algorithm is quadratic because for two numbers - say they're both the same length - the number of cells in the middle is quadratic, and that's the bulk of the work as numbers get bigger.
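The grid-based version of the algorithm translates naturally into code. This is a minimal Python sketch (the name `long_multiply` is my own): the doubly nested loop fills exactly the $nm$ cells of the central section, and a final pass does the column sums and carries:

```python
def long_multiply(x, y):
    # School long multiplication: one single-digit product per grid cell,
    # then skewed column sums with carries.
    xd = [int(c) for c in str(x)][::-1]   # digits, least significant first
    yd = [int(c) for c in str(y)][::-1]
    cols = [0] * (len(xd) + len(yd))      # one slot per output column
    for i, a in enumerate(xd):
        for j, b in enumerate(yd):
            cols[i + j] += a * b          # the n*m cells of the central section
    carry, digits = 0, []
    for c in cols:
        carry, d = divmod(c + carry, 10)
        digits.append(str(d))
    return int(''.join(reversed(digits)))
```

The nested loop is where the quadratic cost lives: for two $k$ digit numbers it runs $k^2$ times.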

# Is there a better way?

When we multiply, clearly we have to look at each digit at least once, so it's certainly true that we can't multiply numbers in better than time that's linear in the size (in digits) of the numbers involved. We've seen here that the usual school algorithm is quadratic, so the question is: can we do better?

As it happens, yes we can. There are algorithms that are significantly faster, but for smaller numbers the extra house-keeping is horrendous and absolutely not worth it. However there are contexts where we are dealing with very, very large numbers - cryptography is one example - and in those cases it really can make a difference. So there are algorithms that are much faster for multiplying really, really large numbers. We'll mention one later.

 In this section we start to look at how we might compare the difficulty of two problems. Is problem $A$ harder than problem $B$? Or is it easier? How can we tell? What does it mean? Here we start to answer those questions.

# Squaring is easier than General Multiplication. Or is it?

We've been looking at general multiplication, but we might wonder if it's easier in the special case of the two numbers being equal. It's obvious that if we can do general multiplication then we can square, but what if we can square things quickly?

 The oracle is a device we use to think about the comparative difficulty of problems. As a thought experiment we say: "What if we could solve problem $P$ instantly - what would that then allow us to do with a small amount of extra work?" The way we express this is to say: "Let's suppose we have an oracle to do this complicated thing - what can we do then?" So here we are saying: "Let's assume we can square numbers instantly and for free. What would the consequences be of that?" The answer is that with a little extra work (which takes linear time) we could then perform arbitrary multiplications. Which is kinda cool. Later we will return to the oracle idea and use it in a much more difficult/complicated setting.

What if we had an oracle to square numbers instantly? Could we use the oracle somehow to multiply any two numbers?

The answer is, perhaps surprisingly, yes, we can. Here's how.

Start with two numbers, $a$ and $b.$ As an example I'll use 256 and 139, but I suggest you pick your own and play along.

We compute their sum, $s=a+b,$ which in my case is $395.$ We also compute their difference, $d=a-b,$ which in my example is $117.$

The numbers $s$ and $d$ are then sent to our oracle which instantly calculates $s^2$ and $d^2,$ which in my case are $s^2=156025$ and $d^2=13689.$ And now the small piece of magic is that the product $ab$ is the difference between $s^2$ and $d^2,$ all divided by 4. So in this case you can check that $256{\times}139$ is $(s^2-d^2)/4,$ which is $(156025-13689)/4,$ which is 35584.

 Exactly how this works is a useful exercise in elementary algebra.
And it is - a couple of lines of algebra will confirm it.

You should try it with your own numbers just to be sure.

 You may find this all faintly ridiculous. But this is a single, simple, concrete example to set up the ideas we'll explore in later articles where the problems we're trying to solve are unfamiliar, much harder, currently unsolved, and of immediate use in the "real world."
So that means that if we have a squaring oracle we can, with just some extra addition, subtraction, and halving, all of which take a linear amount of time, work out any product we want to. So in some sense we don't have to know how to do general multiplication, we can just learn how to square, and then use the above technique. The total time taken will then be the time the oracle takes, plus the time for our extra work.
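The whole procedure fits in a few lines. In this Python sketch (the name `multiply_via_squares` is my own) the two squarings stand in for the oracle calls, and everything else is the linear-time fix-up work:

```python
def multiply_via_squares(a, b):
    # Uses the identity a*b = ((a+b)^2 - (a-b)^2) / 4.
    s = a + b              # sum: one oracle call squares this
    d = a - b              # difference: the other oracle call squares this
    # s and d have the same parity, so s*s - d*d is always divisible by 4
    return (s * s - d * d) // 4
```

Expanding $(a+b)^2-(a-b)^2$ gives $4ab,$ which is why the final division by 4 recovers the product exactly.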

# Faster Squaring (and hence faster Multiplication)

Now we can see one method for faster squaring. We can square numbers with $k$ digits in time faster than $k^2$ as follows.

 As an example, let's compute $27182^2.$ We split $27182$ into $27000+182\;=\;27\times{10^3}+182$ and then compute: $27^2=729,$ $2{\times}27{\times}182\;=\;9828,$ $182^2=33124.$ Then our result is: $729000000+9828000+33124\;=\;738861124.$
Suppose a number $N$ has $k$ digits. Split $N$ into two pieces, taking $b$ to be the rightmost half of the digits (or slightly more if there is an odd number of digits) and $a$ to be the "top half" of the number, so $N=a{\times}10^h+b$ where $h=k/2$ (rounded up if $k$ is odd). Then we can compute:

\begin{align}N^2&=(a.10^h+b)(a.10^h+b)\\&={a^2}10^{2h}+2ab10^h+b^2\end{align}

 [Figure: Comparison of $k^2$ versus $k^{1.585}$]
So this shows that we can accomplish the squaring of a $k$ digit number with two squarings, one multiplication, a doubling, and some additions. In each case the numbers we are using, $a$ and $b,$ are half the size of the original ($N$). We convert the multiplication of $a$ and $b$ into a squaring process, and then repeat the entire process with these smaller numbers, $a$ and $b.$ Eventually we get down to single digits, which we assume we can square or multiply in constant time.

 Proving the time taken to be $k^{\log_2 3}$ is beyond the scope of this article, but broadly speaking it goes like this. Let $T(k)$ be the time taken either to square a $k$ digit number or to multiply two $k$ digit numbers. Then $T(k)\;=\;3T(k/2)$ plus linear terms. We can verify that $T(k)\;=\;k^{\log_2 3}$ satisfies that relationship.
Analysing this shows that the time taken to square a number is now proportional to $k^{\log_2 3},$ and $\log_2 3$ is about $1.585.$ This power is much smaller than 2, so the time taken grows much more slowly than $k^2.$ The cost is all the extra fiddling about and remembering intermediate results.
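The recursion can be sketched concretely. One detail in this Python sketch (the name `square` is my own): to keep the count at three half-size squarings per level, the product $ab$ is recovered as $((a+b)^2-a^2-b^2)/2,$ reusing the squares of $a$ and $b$ that are computed anyway - a close cousin of the oracle identity from earlier:

```python
def square(n):
    # Recursive squaring of a non-negative integer:
    # N = a*10^h + b, so N^2 = a^2*10^(2h) + 2ab*10^h + b^2,
    # with ab recovered from three squarings via (a+b)^2 - a^2 - b^2 = 2ab.
    if n < 10:
        return n * n                      # single digits: constant time
    h = (len(str(n)) + 1) // 2            # size of the lower half, rounded up
    a, b = divmod(n, 10 ** h)
    a2, b2 = square(a), square(b)
    ab = (square(a + b) - a2 - b2) // 2   # a*b from the third squaring
    return a2 * 10 ** (2 * h) + 2 * ab * 10 ** h + b2
```

Each level does three squarings of half-size numbers plus linear-time additions, which is exactly the $T(k)=3T(k/2)$ recurrence.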

But if the numbers are huge to start with, this is definitely worth it. We can now use this to create a faster algorithm for multiplication via squaring. Alternatively, by using this as our inspiration it's also possible to find a similarly convoluted algorithm for multiplication that also runs in time proportional to $k^{1.585}.$

To see how, suppose we want to multiply $a$ and $b,$ and suppose they have $k$ decimal digits. As before, let $h$ be $k/2$ rounded up, and write $a$ and $b$ as $a_1s+a_0$ and $b_1s+b_0,$ where $s=10^h.$ In essence, we write each of $a$ and $b$ as a combination of their upper and lower halves (digit-wise).

 Where these particular sums and products come from is the magic in this method, and is down to the inventive genius of the people who first came up with the technique. You are expected neither to come up with this yourself, nor to see immediately where they come from. To say that sort of thing always feels a little unsatisfactory, but it's one of those tricks that you learn as a trick, and which then becomes more obvious and "natural" as you become more familiar with the area.
If we write $u=a_1b_1$ and $l=a_0b_0$ (for upper and lower products) and we write $d=(a_1+a_0)(b_1+b_0)-(u+l)$ (although at first glance there is no obvious reason why we should do this) then we find that:

$ab=us^2+ds+l$

Thus we have accomplished a general multiplication with three smaller multiply operations and some linear operations. Verifying the details is left to the (possibly mythical) interested reader.
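For the (possibly mythical) interested reader, here is the $u,$ $l,$ $d$ decomposition as a minimal Python sketch (the name `karatsuba` reflects the technique's usual name; it assumes non-negative integers):

```python
def karatsuba(a, b):
    # Multiply using three half-size products instead of four:
    # ab = u*s^2 + d*s + l, where u = a1*b1, l = a0*b0,
    # and d = (a1 + a0)(b1 + b0) - (u + l).
    if a < 10 or b < 10:
        return a * b                      # single digits: constant time
    k = max(len(str(a)), len(str(b)))
    h = (k + 1) // 2                      # half the digits, rounded up
    s = 10 ** h
    a1, a0 = divmod(a, s)                 # upper and lower halves of a
    b1, b0 = divmod(b, s)                 # upper and lower halves of b
    u = karatsuba(a1, b1)                 # upper product
    l = karatsuba(a0, b0)                 # lower product
    d = karatsuba(a1 + a0, b1 + b0) - (u + l)   # the middle term
    return u * s * s + d * s + l
```

Only three recursive calls appear per level, which is where the $k^{1.585}$ running time comes from.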

# Clarifying the words we use

So we've talked about Squaring and Multiplying - we think of these as problems to solve, and the algorithms we use are the solutions. But what exactly do we mean by "a problem"?

In the case of squaring (which we've seen is, with a little extra work, the same as multiplying) we have an infinite collection of numbers, and for each number we ask the question: What is its square?

Loosely speaking we can say that:

• The problem of SQR is ...
• a collection of numbers of which ...
• we ask "What is the square?"
 An algorithm is a set of rules and/or processes to follow, which can be given an input, is guaranteed to terminate, and produces an output.
A solution to the problem of SQR is an algorithm which, when given a particular number, will return the square of that number.

Now let's generalise.

• A problem is ...
• a collection of things (which we call "questions" or "instances") ...
• each of which has an associated other thing (which we call the "answer" or "result") ...
• which we want returned to us when we "ask the question".
Sometimes a given question will have a collection of answers, any one of which will do. In the case of "Factoring" (which we talk about shortly) a "Question" is a positive integer of which we want a factor, and the "answer" is simply any non-trivial factor. Some formulations of the factoring problem require all factors to be returned, but that is a detail we need not consider (yet).

 We talk about the "Time Complexity" of an algorithm, being the function that gives us an upper bound for the worst case performance of that algorithm. When we talk about the time complexity of a problem we mean the time complexity of the best possible algorithm. Of course, sometimes we don't know the best possible algorithm, we only know the best algorithm so far. In that case we talk about the currently known time complexity of a problem, although we often just say "the complexity" and assume the listener/reader knows that we mean the "currently best known time complexity." Indeed, it is still an open question in computing what the best possible algorithm for multiplication is.
So:

• The "things" in the collection are called instances or questions,
• The things we get back are called answers, results, or responses.
• A solution to a problem is an algorithm which ...
• when given an instance ...
• returns to us the associated result.
In the specific example of the problem SQR we have:

• The instances of the problem SQR are numbers,
• The results are the squares of those numbers, and
• We have two solutions, two algorithms, which are given above.
Summarising, and at the risk of repeating ourselves somewhat, we take the technical definitions to be that:

• A Problem is a collection of things about which we ask a specific question.
• We call these things instances.
• Each instance has an associated answer or result.
• A Solution for a problem is an algorithm that when given an instance will give back the desired response.
• The Time Complexity of an algorithm is the function that tells us, as a worst case, how long it will take to find the result for a given question.
• We usually just ask for the dominating form of the function.
 When we compute how long an algorithm takes in the worst case we are usually only concerned about the behaviour in the long run, as the inputs get larger and larger. As such, if the time complexity of an algorithm is something like $a_2x^2+a_1x+a_0,$ we don't much care about the $x$ term or the constant term, because in the long run they will be overwhelmed by the $a_2x^2$ term. That's what we mean by the "dominating form" of the function.
So we can talk about the "problem of squaring" as being the (infinite) collection of numbers to be squared. As we have seen above, the time complexity of addition is linear, the time complexity of multiplication is at worst quadratic (although it's actually at worst $k^{1.585},$ and in fact it's the subject of on-going research, and is now known to be better than that), and the time complexity of squaring is effectively the same as that for multiplication.

Finally, the time complexity is, after all, a function, and so far I've talked about it being of an algorithm. So the function needs an input. What is the input?

 Technical point: The notation $O(...)$ is used to denote the upper bound of the time complexity of an algorithm. We then abuse the notation to use it to refer to the time complexity of the best known algorithm for a problem. Using this notation: $O(MUL)\;=\;O(SQR)$
What we are interested in is how long the algorithm takes as a function of the size of the instance it's given. Multiplying numbers takes more and more time as the numbers get bigger. The time complexity is a function of the size of the instances. So for the algorithm for squaring, the time complexity is $M(s)$ where $s$ is the size of the instance. But what is the size of the instance?

The size of the instance is the number of bits required to specify it.

This amounts to saying - it's the number of symbols required to write it down.

On this basis, 123456 is twice the size (as an instance) of 123, and that matches how we deal with it when we multiply by it or square it. It's the number of symbols involved that dictates the amount of work we have to do - and that feels like a reasonable definition.

# The inverse of Multiplication - the problem of Factoring

 Technical point: For some problems we just want a Yes/No result, but for others we want some sort of object. In the case of multiplication we want to know what the product is, in the case of factoring we want to know a specific factor, but for other problems we just want to know "Is this possible?" The latter type of problem is called a "Decision Problem" and they turn out to be critically important. However, it is often the case that an algorithm for a decision problem can be converted into an algorithm to give us an actual result. We will return to this later.
Let's look at another problem: Factoring integers. For this problem we are given an integer and we are asked to find a divisor - also called a "factor." There is a simple algorithm to do this: Start at 2 and see if that's a divisor. If it is, stop, otherwise ask about 3, then 4, then 5, and so on. This process must eventually stop, because every number is divisible by itself. When it does stop, it will have found a (possibly trivial) divisor.

The problem with this algorithm is that it's slow. Really slow. If you have a five decimal digit number then you might have to perform tens of thousands of trial divisions. Indeed, if the number you're given is prime then you will have to try every number up to the number itself.
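The trial-division algorithm just described takes only a few lines. In this Python sketch (the name `find_factor` is my own, and it assumes its input is at least 2), the result is the smallest divisor greater than 1 - which is the number itself exactly when the number is prime:

```python
def find_factor(n):
    # Trial division for n >= 2: try 2, then 3, then 4, and so on.
    # Every number divides itself, so the loop always terminates,
    # returning a (possibly trivial) divisor.
    d = 2
    while n % d != 0:
        d += 1
    return d
```

For a prime input this performs a number of divisions proportional to the number itself, which grows exponentially in the number of digits - hence "really slow."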

# In summary

We've shown that if we can square numbers, then with a little more work we can multiply numbers in general. In fact, if we have someone who can magically square numbers for us then with just the extra ability to add, subtract, and divide by a constant we then have the ability to multiply.

In particular, we now have the ideas of how to compare the difficulty of different problems, and how to use a solution to one problem to give us a way to solve another. In short, we are now ready to start comparing the difficulty of problems in earnest.

Next post:

• What are some problems in NP but not known to be in P?
• How do they compare with FAC?
• How do they compare with each other?
• What is "NP-Complete"?
• Why do we care?
Many thanks to @RealityMinus3/E.A.Williams and others for detailed feedback, which has improved this page enormously.