The post Sample Set appeared first on Tony's Blog.
Today we are going to discuss one of the most fundamental concepts of mathematics and, of course, of probability theory: the set.
Thousands of years ago, our ancestors were already able to distinguish a single apple from a sack of apples. The single apple and the apples in the sack can be regarded as specific objects; that is, every apple can be identified within the ‘bunch’ of apples. We can label them as $\{\text{Apple}_1,\text{Apple}_2,\dots ,\text{Apple}_n\}$, and these labels satisfy:
$$
\text{Apple}_i\neq \text{Apple}_j \text{ where } i\neq j
$$
However, the sack of apples as a whole also attracted people's attention. That whole is what we call a set.
Let’s look at another example of a set. We now have an Incident Response Team made up of several Avengers:
$$
\{\text{Natasha},\text{Thor},\text{Steve},\text{Clinton},\text{Tony},\text{Bruce}\}
$$
But one day they found that they needed a more powerful guy to help them learn probability theory, so they invited me to join. Aha, the team is now listed as:
$$
\{\text{Natasha},\text{Thor},\text{Steve},\text{Tony},\text{Clinton},\text{Tony},\text{Bruce}\}
$$
Since we now have two ‘Tony’s, the list above alone cannot tell us which Tony is me and which one is Anthony Edward Stark (Iron Man).
The problem we came across reveals an essential property of a set: no element may appear more than once. There are many ways to solve this duplicate problem; for instance, we can list family names as well, so the team becomes
$$
\{\text{Natasha Romanoff},\text{Thor Odinson},\text{Steve Rogers},\text{Tony Tan},\\
\text{Clinton Barton},\text{Tony Stark},\text{Bruce Banner}\}
$$
The same member cannot be counted twice in a set, and different members that carry the same name in the list are treated as one and the same (we have no other information to tell them apart). Although adding family names solved the duplicate problem here, listing the members of a set does not always work.
To represent a specific set, the list should be complete. However, most of the time we are not able to list a set completely. Here are some examples of sets whose members cannot all be listed:
It seems impossible to enumerate all the elements of these two sets.
In probability, the notion of a set plays an even more fundamental role. Sets can be general or concrete. Since general notions are always harder to understand, we begin our probability theory with easier, concrete examples:
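The duplicate problem can be sketched with Python's built-in `set`, which also keeps at most one copy of each element. The variable names below are hypothetical and only mirror the team lists above.

```python
# The roster as first listed, with the ambiguous second 'Tony'.
roster = ["Natasha", "Thor", "Steve", "Tony", "Clinton", "Tony", "Bruce"]

team = set(roster)      # a set keeps only one copy of each element
print(len(roster))      # 7 entries in the list
print(len(team))        # 6 distinct elements: the two 'Tony' entries merge

# Adding family names makes every member distinguishable again.
full_names = {
    "Natasha Romanoff", "Thor Odinson", "Steve Rogers", "Tony Tan",
    "Clinton Barton", "Tony Stark", "Bruce Banner",
}
print(len(full_names))  # 7
```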
Then let’s look at the smaller ones:
– a’) The rotten apples in that bushel
– b’) The outcomes in which the six dice all show different faces when six dice are rolled
– c’) The mathematics majors of that college
All of the examples (a to c and a’ to c’) are sets. These examples and the Avengers example share one abstract feature: each is “a bunch of things”.^{1}
The examples help us see intuitively what a set is. However, we are mathematicians, and we should be more precise: “a bunch of things” can never be a formal definition. So we need to set up a mathematical model for a set.
Every single thing (object, situation, result, etc.) in the bunch will be called a ‘point’, and the bunch of things is called a ‘space’. If you have learned linear algebra, you will be familiar with the word ‘space’: a vector space is made up of (usually infinitely many) vectors that obey certain rules. For instance, in example a), every apple is a point, and the points are, of course, different from each other. The bushel that contains all the apples is the space. In the same way, we can find the points in the other examples, and each of those examples is a space. To distinguish the space of probability theory from spaces in other contexts, we prefix it with ‘sample’, and we do the same for the point. They become the sample space and the sample point.
We use $\Omega$ to denote the sample space and $\omega$ to denote its sample points. Every point can have a specific name, as in the picture where I drew a sack of things.
Its mathematical model is:
$$
\Omega=\{\omega_{\text{apple}},\omega_{\text{raining}},\omega_{\text{car}},\omega_{\text{sunshine}},\omega_{\text{ghost}}\}
$$
Sets are made up of elements; in other words, spaces consist of points. When all the elements of a set $S_n$ come from another set $S_0$, $S_n$ is a subset of $S_0$. The sets in examples a’) to c’) are subsets of the sets in examples a) to c). Two extreme subsets are the biggest one, the set itself, and the smallest one, the empty set, which contains nothing.
The empty set is a special case in set theory and has its own notation, $\emptyset$.
The description “a subset is smaller than the original set, and the empty set is the smallest subset” uses a concept we have not defined yet: the size of a set, which tells us which of two sets is bigger. The number of points in a set is called its “size”, and it is denoted by:
$$\vert S\vert$$
The size of
$$
\Omega=\{\omega_{\text{apple}},\omega_{\text{raining}},\omega_{\text{car}},\omega_{\text{sunshine}},\omega_{\text{ghost}}\}
$$
is
$$
\vert \Omega\vert=5
$$
The size of
$$
\Omega_{\text{Avengers}}=\{\text{Natasha Romanoff},\text{Thor Odinson},\text{Steve Rogers},\text{Tony Tan},\\
\text{Clinton Barton},\text{Tony Stark},\text{Bruce Banner}\}
$$
is
$$
\vert\Omega_{\text{Avengers}}\vert=7
$$
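Subsets and sizes can be checked in a few lines of Python. This is only a sketch: the subset `{"apple", "car"}` is a hypothetical choice made for illustration, and `len` plays the role of $\vert S\vert$.

```python
omega = {"apple", "raining", "car", "sunshine", "ghost"}
subset = {"apple", "car"}   # hypothetical subset chosen for illustration
empty = set()               # the empty set

print(subset <= omega)      # True: every point of subset is in omega
print(empty <= omega)       # True: the empty set is a subset of any set
print(omega <= omega)       # True: a set is a subset of itself
print(len(omega), len(subset), len(empty))  # 5 2 0
```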
The size has several properties, and here are the most useful three:
By the way, the natural numbers can be defined through the sizes of sets, but that is not our business here.
“Finite” and “infinite” are two general classifications of size. “Countable” and “uncountable” are two further, distinct concepts. “Countable” does not mean the same as “finite”, whereas every “uncountable” set is infinite.
Their precise relation is shown in the Venn Diagram
Definition of Countable/Uncountable: An infinite set $A$ is countable if there is a one-to-one correspondence between the elements of $A$ and the set of natural numbers $\{1, 2, 3, \dots\}$. A set is uncountable if it is neither finite nor countable. If we say that a set has at most countably many elements, we mean that the set is either finite or countable.^{2}
The most common example of a set that is infinite but countable is the set of all positive odd numbers: the index of the element $n$ among the natural numbers $\{1, 2, 3, \dots\}$ can be calculated by:
$$
f(n)=\frac{n+1}{2}
$$
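A one-to-one correspondence can be made concrete by pairing elements off. The sketch below assumes that we mean the positive odd numbers and that indexing starts at 1, as in the definition above; the generator name `positive_odds` is hypothetical.

```python
from itertools import count, islice

def positive_odds():
    """Yield 1, 3, 5, ... without end."""
    n = 1
    while True:
        yield n
        n += 2

# Pair each natural number k = 1, 2, 3, ... with the k-th positive odd number.
pairs = list(islice(zip(count(1), positive_odds()), 5))
print(pairs)  # [(1, 1), (2, 3), (3, 5), (4, 7), (5, 9)]
```

Every natural number gets exactly one odd partner and vice versa, which is what countability asks for.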
All the sets above are defined by words. Although we now have a mathematical model, it is still difficult to find a proper method to define a set rigorously. A set is well defined when it is possible to tell whether any given point belongs to the set or not.
“Belongs to” is a relation between a point and a space. We say that “Iron Man” (a point) belongs to the Avengers (a space). It is denoted by:
$$
\omega_{\text{Iron Man}}\in \Omega_{\text{Avengers}}
$$
And “not belong to” is denoted as:
$$
\omega_{\text{Super Man}}\notin \Omega_{\text{Avengers}}
$$
The first method that defines a set rigorously is enumeration: we put all the points in curly brackets. For example, rolling a six-sided die has $6$ possible outcomes:
$$
\{1,2,3,4,5,6\}
$$
Intuitively, rolling two distinguishable six-sided dice has $6\times 6$ possible outcomes:
$$
\begin{aligned}
\{&(1,1),(1,2),(1,3),(1,4),(1,5),(1,6)&\\
&(2,1),(2,2),(2,3),(2,4),(2,5),(2,6)&\\
&(3,1),(3,2),(3,3),(3,4),(3,5),(3,6)&\\
&(4,1),(4,2),(4,3),(4,4),(4,5),(4,6)&\\
&(5,1),(5,2),(5,3),(5,4),(5,5),(5,6)&\\
&(6,1),(6,2),(6,3),(6,4),(6,5),(6,6)&\}
\end{aligned}
$$
As we can see, two distinguishable dice produce $6^2=36$ outcomes, and three produce $6^3=216$ outcomes. When we roll $6$ dice, we get a set of size $6^6=46656$. Listing all the combinations by hand is impractical. However, we can solve this problem through a mathematical model:
$$
\Omega=\{(s_1,s_2,s_3,s_4,s_5,s_6)|s_j \text{ may be: }1,2,3,4,5,6 \text{ where } 1\leq j\leq 6\}
$$
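The set-builder description above can be sketched with `itertools.product`, which generates every tuple $(s_1,\dots,s_6)$ without us writing them out. The variable names are hypothetical; the last count illustrates the subset from example b’) in which all six faces differ.

```python
from itertools import product

faces = range(1, 7)

# Two dice: small enough to store as a list.
two_dice = list(product(faces, repeat=2))
print(len(two_dice))     # 36

# Six dice: count the tuples without storing them.
six_dice_size = sum(1 for _ in product(faces, repeat=6))
print(six_dice_size)     # 46656

# Subset: outcomes in which all six dice show different faces.
all_different = sum(1 for s in product(faces, repeat=6) if len(set(s)) == 6)
print(all_different)     # 720
```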
This is a simple use of a mathematical expression. However, it does not always work, for example when defining the set of all the girls in the world. Listing all their names is tedious and impossible, and the set is dynamic because people are being born and are dying at this very moment.
The second method of determining a set is through a specified rule of membership. There will always be people who quibble about the meaning of the words in such a rule, but that is pointless.
Subsets have been explained above, but here is the official definition:
If every point of $A$ belongs to $B$, then $A$ is contained or included in $B$ and is a subset of $B$, and $B$ is a superset of $A$.
We can write this relation in two ways:
$$
A\subset B\text{ , } B\supset A
$$
Two sets are identical if they contain exactly the same points, and then we write
$$
A=B
$$
By the way, a roundabout method of checking whether two sets are identical is to use the fact that $A=B$ if and only if $A\subset B$ and $A\supset B$. Although this way seems indirect, it is often the easiest way to investigate whether two sets are identical.
This post is the first one in our series on Probability Theory. We talked about some concepts and properties of sets and sample sets, about the relationship between elements (also known as points) and a space (also known as a set), and about the relationship between one set and another. A subset is also a set, but it holds a special position in probability theory. Without understanding the concept of a subset, we might get confused in later discussions.
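The mutual-inclusion test can be sketched directly. The helper name `same_set` is hypothetical; Python's `<=` on sets means "is a subset of".

```python
def same_set(a, b):
    """Return True when a and b contain exactly the same points."""
    return a <= b and b <= a   # A is a subset of B, and B is a subset of A

A = {1, 2, 3}
B = {3, 2, 1, 1}   # order and repetition do not matter in a set
C = {1, 2}

print(same_set(A, B))  # True
print(same_set(A, C))  # False: C is a subset of A, but A is not a subset of C
```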
The post Binary Numbers appeared first on Tony's Blog.
Today we are going to talk about numbers, especially binary numbers. We were able to use numbers even as babies; for example, you must have asked your mother for ‘an’ apple, ‘a’ toy or ‘one’ dollar. These words are so ordinary that for a long time we never noticed the important and essential mathematical concept behind them: the number. Sure enough, although we all know $1,2,3,\dots$ very well, we may never have thought about what a number is until we were asked to. Because we will use binary and decimal numbers side by side, and the concept of number is the key to both, we have to take a closer look at the definition of a number.
From now on, to make everything clear, we assume that we know nothing about numbers: even $1,2,3,\dots$ are unknown until they are defined formally and precisely.
When we get something from others, whatever it is, what concerns us most is how many there are. This is a question about quantity, and the figure below gives an answer:
Three rows represent three different things:
□. The row I contains $\{\text{apple}\}$
○. The row II contains $\{\text{apple},\text{apple}\}$
△. The row III contains $\{\text{apple},\text{apple},\text{apple}\}$
To answer the question of how many there are, I have drawn three symbols on the right-hand side of the equations, each of which represents the number of things in its row. (There is a small bug here: we have not yet defined what equality means.) So:
□. The quantity of apples in row I is ‘rectangle’
○. The quantity of apples in row II is ‘circle’
△. The quantity of apples in row III is ‘triangle’
Aha, by now we have defined some numbers: $\{□,○,△\}$. Although we have defined only △ numbers (the quantity of $\{□,○,△\}$ is △), we can use this strategy to define as many numbers as we want. So it may already have dawned on you that *a number is just a symbol that gives a definite and unique answer to the question of how many things there are*. Even though we surely have the ability and the time to define enough symbols to answer any quantity question, doing so is monotonous and inefficient. This leads to a new idea: how about using just ***a few symbols*** from which we can create infinitely many different combinations? This simple idea gives us enough tools and materials to build a concrete and elegant edifice of numbers. Our great forefathers then created $\{0,1,2,3,\dots\}$. I have to admit that these symbols are more convenient than my ‘gurgles’ (the name of a children's show in ‘Good Luck Charlie’ whose heroes are a rectangle, a circle, etc.).
How many symbols we use decides what the number system is: with just 2 symbols we get the binary number system, and with ten symbols we get the decimal number system.
Just as most humans know only decimal numbers, computers know only binary ones (or base-$2^n$ ones, such as the octal and hexadecimal systems) because of their hardware architecture. Binary numbers are expressed as:
$$
\dots b_2b_1b_0.b_{-1}b_{-2}\dots \text{ where } b_i\in\{0,1\}
$$
This may look weird to you if you are not a computer science student, but all our computations on computers, smartphones, and so on are based on binary. Each binary digit is called a bit.
Let’s look at an example: the decimal number 4 can be expressed as $(100.)_2$ in base 2, which we can write in the form:
$$
(4)_{10}=(100)_2
$$
Translating a binary number into a decimal one that we can read is relatively easier than the reverse:
$$
n_{10}=\dots b_2 2^{2}+b_1 2^{1}+b_0 2^{0}+b_{-1} 2^{-1}+b_{-2} 2^{-2}+\dots \tag{1}
$$
The powers $2^{n}$ here must be calculated in the decimal system, where $2^2=4, 2^{10}=1024,\dots$
For example, to convert $(10010)_2$ to a decimal number:
$$
1\times 2^4 + 0\times 2^3 + 0\times 2^2 + 1\times 2^1 + 0\times 2^0\\
=16_{10}+0_{10}+0_{10}+2_{10}+0_{10}\\
=18_{10}
$$
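Formula (1) translates almost word for word into code. The following is a minimal sketch for finite binary strings; the function name `binary_to_decimal` is hypothetical, and `Fraction` is used so that the fractional bits stay exact.

```python
from fractions import Fraction

def binary_to_decimal(bits: str) -> Fraction:
    """Evaluate a finite binary string such as '10010' or '0.1011' with formula (1)."""
    integer_part, _, fractional_part = bits.partition(".")
    total = Fraction(0)
    # b_0, b_1, b_2, ... are read from the radix point leftwards.
    for i, b in enumerate(reversed(integer_part)):
        total += int(b) * Fraction(2) ** i
    # b_{-1}, b_{-2}, ... are read from the radix point rightwards.
    for i, b in enumerate(fractional_part, start=1):
        total += int(b) * Fraction(1, 2 ** i)
    return total

print(binary_to_decimal("10010"))   # 18
print(binary_to_decimal("0.1011"))  # 11/16
```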
The algorithm we discuss next converts decimal numbers to binary numbers.
We divide the decimal number into two parts, the integer part and the fractional part. For example,
$$
(53.7)_{10}=(53.)_{10}+(0.7)_{10}
$$
We divide the integer part by $2$ successively until the quotient is 0, recording the remainders, which will always be $0$ or $1$; for example, $5\div 2 =2 \dots 1$, where 2 is the quotient and 1 is the remainder. The recorded remainders are written starting at the decimal point (radix point is the more accurate term) and moving left, so the first remainder is the lowest bit:
$$
\begin{aligned}
53\div 2&=26 &\dots 1\\
26\div 2&=13 &\dots 0\\
13\div 2&=6 &\dots 1\\
6\div 2&=3 &\dots 0\\
3\div 2&=1 &\dots 1\\
1\div 2&=0 &\dots 1\\
\end{aligned}
$$
Reading the remainders from bottom to top, $53.$ in base 10 is equal to $110101.$ in base 2. To check this result, we can use formula (1):
$$
1\times 2^5+1\times 2^4+0\times 2^3+1\times 2^2+0\times 2^1+1\times 2^0=53
$$
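The repeated-division procedure can be sketched as follows. The function name `int_to_binary` is hypothetical; `divmod` returns the quotient and the remainder in one step.

```python
def int_to_binary(n: int) -> str:
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # quotient and remainder of n / 2
        remainders.append(str(r))  # the first remainder is the lowest bit
    return "".join(reversed(remainders))

print(int_to_binary(53))  # 110101
```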
To convert $(0.7)_{10}$ to binary, we reverse the preceding steps: multiply by 2 successively, record the integer part each time, remove it, and go on with the remaining fractional part:
$$
\begin{aligned}
0.7\times 2&=0.4 &+ 1\\
0.4\times 2&=0.8 &+ 0\\
0.8\times 2&=0.6 &+ 1\\
0.6\times 2&=0.2 &+ 1\\
0.2\times 2&=0.4 &+ 0\\
0.4\times 2&=0.8 &+ 0\\
0.8\times 2&=0.6 &+ 1\\
0.6\times 2&=0.2 &+ 1\\
0.2\times 2&=0.4 &+ 0\\
&\vdots&
\end{aligned}
$$
Notice that the steps from $0.4\times 2$ to $0.2\times 2$ repeat over and over, so the binary digits repeat infinitely. We write the result as:
$$
(0.7)_{10}=(0.1\overline{0110})_{2}
$$
From this, we conclude that
$$
53.7_{10}=(110101.1\overline{0110})_{2}
$$
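The multiply-by-2 procedure, including the detection of the repeating block, can be sketched as below. The function name `fraction_to_binary` is hypothetical, and the input is an exact `Fraction` (an assumption of this sketch) because the floating-point value `0.7` is itself only a binary approximation.

```python
from fractions import Fraction

def fraction_to_binary(x: Fraction):
    """Return (non_repeating_bits, repeating_bits) for a fraction 0 <= x < 1."""
    seen = {}   # fractional part -> position of the bit it produces
    bits = []
    while x != 0 and x not in seen:
        seen[x] = len(bits)
        x *= 2
        bit = int(x)          # the integer part is the next binary digit
        bits.append(str(bit))
        x -= bit              # keep only the fractional part
    if x == 0:                # the expansion terminates
        return "".join(bits), ""
    start = seen[x]           # the cycle begins where this fraction first appeared
    return "".join(bits[:start]), "".join(bits[start:])

print(fraction_to_binary(Fraction(7, 10)))   # ('1', '0110')
print(fraction_to_binary(Fraction(11, 16)))  # ('1011', '')
```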
Formula (1) tells us how to convert binary numbers into base 10, and a few little tricks make the fractional part more concise.
$$
\begin{aligned}
(.1011)_2&=1\times(\frac{1}{2})^1+0\times(\frac{1}{2})^2+1\times(\frac{1}{2})^3+1\times(\frac{1}{2})^4\\
&=(\frac{11}{16})_{10}
\end{aligned}
$$
There is nothing difficult in this process, but how should infinitely repeating expansions be calculated? Suppose $x=(0.\overline{1011})_2$; let’s convert it to decimal:
$$
\begin{aligned}
x&=0.\overline{1011}\\
2^4x &=1011. \overline{1011}\\
2^4x -x&=1011. \overline{1011}-0.\overline{1011}\\
(16-1)_{10}x&=1011_{2}=11_{10}\\
x&=(\frac{11}{15})_{10}
\end{aligned}
$$
A more complicated example: what is $x=(0.10\overline{101})_2$ as a decimal fraction? First, shift the radix point past the non-repeating part:
$$
z=2^2x=(10.\overline{101})_2
$$
Since
$$
(10)_2=2_{10}
$$
then we set:
$$
y_{10}=(z-2)_{10}=(.\overline{101})_2
$$
Using the same method as in the last example, we get:
$$
(2^3-1)y_{10}=101_2=5_{10}\\
y_{10}=(\frac{5}{7})_{10}
$$
Then we can get $z$ from the third formula:
$$
z=y_{10}+2=\frac{19}{7}
$$
Then we can get $x$ from the first formula:
$$
x=z_{10}\div 4=\frac{19}{28}
$$
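Both examples follow the same pattern, which can be sketched in one function. The name `repeating_binary_to_fraction` and its two-argument interface are hypothetical; the function only handles numbers between 0 and 1.

```python
from fractions import Fraction

def repeating_binary_to_fraction(prefix: str, repeat: str) -> Fraction:
    """Exact value of (0.prefix repeat repeat ...)_2 for bit strings prefix and repeat."""
    # Value of the non-repeating bits right after the radix point.
    head = Fraction(int(prefix, 2), 2 ** len(prefix)) if prefix else Fraction(0)
    # y = (0.repeat repeat ...)_2 satisfies (2^k - 1) * y = (repeat)_2, with k = len(repeat).
    y = Fraction(int(repeat, 2), 2 ** len(repeat) - 1)
    # The repeating block starts len(prefix) places to the right of the radix point.
    return head + y / 2 ** len(prefix)

print(repeating_binary_to_fraction("", "1011"))   # 11/15
print(repeating_binary_to_fraction("10", "101"))  # 19/28
print(repeating_binary_to_fraction("1", "0110"))  # 7/10
```

The last call recovers $(0.7)_{10}$ from $(0.1\overline{0110})_2$, which checks the earlier conversion in the opposite direction.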
In this post we have learned something about binary numbers; the central topic is how to convert between decimal and binary.
The post A Brief Introduction to Reinforcement Learning appeared first on Tony's Blog.
Some of you may be confused by the title, since most blog series or articles begin with a ‘real’ introduction. Believe me, I tried to gather all the information that I think should be known before we learn reinforcement learning algorithms into one really good introduction, but I finally gave up because there are far too many aspects to cover. Then an idea dawned on me: why not write a brief introduction covering only the most basic concepts, and present a good survey or overall summary at the end of the series? So in this article I will only talk about:
1. What Is Reinforcement Learning
2. Supervised Learning, Unsupervised Learning and Reinforcement Learning
3. Some Basic Concepts
This is always the first question about this subject. More details can be found in the Wikipedia article ‘Reinforcement Learning’^{1}. However, here I want to present a more readable interpretation: reinforcement learning is a kind of machine learning whose purpose is to solve problems by imitating, through computer programs, the learning process of human beings or other intelligent creatures. This long sentence contains three important points:
1. The purpose is to solve problems that we come across and that have not been solved by known methods
2. Most of the ideas in reinforcement learning come from psychology and neuroscience
3. We simulate all the conditions and run our algorithms on one (or more) modern computer(s)
This may be the best introduction I can give in plain English, yet some readers might still be confused, because the description above fits ordinary machine learning methods such as linear regression or neural networks just as well. So let’s look at the distinctions between reinforcement learning and other machine learning algorithms.
By the way, if you wonder why it is called ‘reinforcement learning’, you can look up the word ‘reinforcement’ in psychology research (Schultz W (July 2015)^{2}) or just read the Wikipedia article ‘Reinforcement’^{3}.
Supervised learning is probably the most popular family of methods we hear about in machine learning or artificial intelligence. Machine learning is the broader concept: it contains supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and so on.
Linear classification algorithms are among the most accessible machine learning methods to begin with. Here I only consider the easiest case: say we have sampled six 2-D points from a dataset, each of which belongs to one of two classes, and we call this class information the ‘label’. Our mission is to find a line or a hyperplane that separates the whole dataset into the two classes. Even though the points may be mixed together, we still have a strategy to accomplish the mission. What we need and what we depend on are the 6 already known points and their classes (labels), for instance:
Point Name | Data | Class(Label) |
---|---|---|
$A$ | $(1,2)$ | A |
$A_1$ | $(2,1)$ | A |
$A_2$ | $(2,2)$ | A |
$B$ | $(6,4)$ | B |
$B_1$ | $(4,4)$ | B |
$B_2$ | $(1,2)$ | B |
This is just a naive linear classification problem, but the defining trait of supervised learning methods is captured by the sentence ‘What we need and what we depend on are the 6 points and their classes (labels)’. The label is probably the most valuable information for finding a solution. We can easily find some solutions to this simple problem, like these:
Given the information we have so far, all three lines are solutions. Which one is better, and which should be chosen as the final classifier, is beyond the scope of this blog series. If you want to know more about this issue, you can read the book ‘Pattern Recognition and Machine Learning’ (Bishop C M (2006)^{4}).
In contrast, reinforcement learning does not have such label information to teach the algorithm step by step what to do or what not to do; this is one of its essential features. However, reinforcement learning does have a clear goal. For instance, a robot is learning to clean a room, and there is no information and no teacher to tell it which direction to go or how long its first step should be. The only useful information known before the mission is ‘clean up this room’. What happens next largely depends on the policies that are initially in the robot’s ‘brain’. In this respect, the way reinforcement learning learns is closer to the way we humans learn.
Unsupervised learning works without any label information, and it differs from both supervised learning and reinforcement learning. Its mission is to find the hidden structure behind a mess of data. Let’s look at the naive example above again, now with the label information removed:
num | data |
---|---|
$D$ | $(1,2)$ |
$D_1$ | $(2,1)$ |
$D_2$ | $(2,2)$ |
$D_3$ | $(6,4)$ |
$D_4$ | $(4,4)$ |
$D_5$ | $(1,2)$ |
The data are then distributed in the 2-D plane like this:
The mission of unsupervised learning is to design an algorithm or strategy that finds some structure behind the data. Although this mission can be accomplished by entirely different algorithms, a good one gives us more useful information.
Figure 3 shows one simple solution, but it is not the only one, nor necessarily the best one.
Unsupervised learning has no label information, and neither does reinforcement learning; in other words, neither has a teacher telling it what to do next. On the other hand, reinforcement learning must have a clear goal, which is not necessary for unsupervised learning.
We have discussed the features that distinguish reinforcement learning from supervised and unsupervised learning; these features make reinforcement learning a separate branch of machine learning. We now go into the details of reinforcement learning to learn some basic concepts.
An agent, of course, is not a super spy like Ethan Hunt (from the ‘Mission: Impossible’ film series) or some other guy with super abilities. It can be an animal, a robot, and so on. However, every agent has a clear goal: a newborn deer just wants to stand on its feet, a robot just wants to clean the room, and so on. Their challenge is not to learn from labeled data or to find structure behind the data, but to reach their goals by interacting with their environment. For the deer, the environment is perhaps the earth’s gravity, the wind, or even the slippery ground. There are definitely no teachers telling them what they should and should not do. They have to decide on their next actions by themselves and make sure these actions help them attain their goals.
An agent must have some abilities. Firstly, the agent should be able to sense its environment; what it senses is known as the state of the environment. This is very important to any agent: if you were already standing on your feet, you would not try to stand up again. Secondly, the actions of the agent change the environment; for example, every action of the robot makes the room different from before. Finally, the agent decides what to do all by itself, based on its policy and on the state of the environment.
This is a brief description of an agent; more details can be found in Richard S. Sutton and Andrew G. Barto (2011)^{5}.
Everything in the problem is a component of the environment; even the agent is a part of the environment. A precise definition of the environment in reinforcement learning is neither easy nor necessary. What we should remember is that the agent is and always will be living, sensing and acting in its environment.
Reward and value are so similar that beginners often cannot tell them apart. I have found a point of view from which we can distinguish them easily.
Rewards are the real signals that we have received or will receive from the environment. For instance, when we are playing a multi-armed bandit,
what we get from an action is the reward, and whether we win or lose is decided by the machine or, more precisely, by the environment. The reward is fixed by the machine and cannot be changed by anything else for any reason. The reward signal, or reward for short, is a real signal produced by the interaction between an agent and its environment.
In contrast, a value is calculated by a value function that was designed before the actions are taken. It works like an oracle that tells you what might happen after each action. In other words, the value is an estimate of the outcome of an action before the action is actually taken.
In our reinforcement learning algorithms, the goal of an agent is always converted into maximizing the agent’s total reward. The reward (signal) depends only on the action and the environment, but the value can depend on everything; sometimes it can even be stochastic. Nevertheless, value functions are still a very reliable piece of information that helps the agent make decisions.
A policy decides how the learning agent behaves at a given time. A vivid description is that a policy is the brain or logic system of an agent, where the agent here is always regarded as an algorithm.
Action and state are both elementary concepts, which I have already mentioned in the sections on the agent and the environment.
There could be some other sections, on limitations, scope, suggestions, and so on, but I do not think they are useful for a beginner, so I plan to discuss these things at the end of the series, in the form of a survey.
This is my first article about reinforcement learning. The concepts will be useful in our future study of algorithms, while the distinctions between reinforcement learning and other machine learning methods give us a big map that shows us where we are.
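The difference between reward and value can be sketched with a toy multi-armed bandit. Everything specific here is an assumption made for illustration and not taken from this post: the win probabilities, the epsilon-greedy rule for choosing an arm, and the running-average update of the value estimates. The point is only that `reward` comes from the environment, while `values` lives inside the agent.

```python
import random

def run_bandit(win_probabilities, steps=1000, epsilon=0.1, seed=0):
    """Play a toy bandit; return (value estimates, pulls per arm, total reward)."""
    rng = random.Random(seed)
    n_arms = len(win_probabilities)
    values = [0.0] * n_arms   # the agent's estimate of each arm (value)
    pulls = [0] * n_arms
    total_reward = 0
    for _ in range(steps):
        # Policy (assumed): mostly pick the arm with the highest estimated value.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: values[a])
        # Environment: the machine alone decides whether we win (reward).
        reward = 1 if rng.random() < win_probabilities[arm] else 0
        # Agent: move the estimate toward the observed reward (running average).
        pulls[arm] += 1
        values[arm] += (reward - values[arm]) / pulls[arm]
        total_reward += reward
    return values, pulls, total_reward

values, pulls, total_reward = run_bandit([0.2, 0.5, 0.8])  # hypothetical machines
print(values, pulls, total_reward)
```

The agent never sees `win_probabilities`; it only sees rewards, and its value estimates are its own guess about what each action will bring.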
The post Efficient Methods for Evaluating Polynomials appeared first on Tony's Blog.
It has been said that the more basic an operation is, the more we stand to gain by doing it efficiently. Addition and multiplication may be the most basic operations of all, and so is their combination, the polynomial. Many functions, however complicated they are, can be approximated by a polynomial. For instance, $\sin(x),x\in[-\pi,\pi]$ can be approximated by
$$
0.987862x-0.155271x^3+0.00564312x^5 \tag{0.1}
$$
their curves are like these:
The figures above show how well a polynomial can approximate a complicated function. However, how to obtain this polynomial is not our concern here; what we care about is how to compute formula (0.1) more efficiently.
We now replace polynomial (0.1) with a more general one that contains a term of every degree:
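The quality of approximation (0.1) can be checked numerically. This is a rough sketch: the grid of 1001 sample points and the name `sin_poly` are arbitrary choices, and the printed number is only the largest error on that grid.

```python
import math

def sin_poly(x):
    """Polynomial (0.1), an approximation of sin(x) on [-pi, pi]."""
    return 0.987862 * x - 0.155271 * x**3 + 0.00564312 * x**5

# Sample [-pi, pi] on an evenly spaced grid and record the largest deviation.
xs = [-math.pi + 2 * math.pi * i / 1000 for i in range(1001)]
max_error = max(abs(sin_poly(x) - math.sin(x)) for x in xs)
print(max_error)
```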
$$
P(x)=2x^4+3x^3-3x^2+5x-1\tag{0.2}
$$
What we are going to do is find the best way to evaluate polynomial (0.2) at $x=\frac{1}{2}$. We assume that the coefficients of the polynomial and the number $\frac{1}{2}$ are always stored in memory or even in registers; that is, we do not take data transfer time into account.
How to measure the efficiency of an evaluation is a new problem facing us, and I have two ways to handle it:
The second is to count the total number of operations in the algorithm: if our algorithm needs 10 multiplications and 5 additions while the state-of-the-art algorithm needs 11 multiplications and 5 additions, ours is better.
In this section, we measure the efficiency of an algorithm the second way: by counting operations.
Now we go back to our main problem of finding the best way to evaluate:
$$
P(x)=2x^4+3x^3-3x^2+5x-1\tag{0.3}
$$
The usual way is to calculate
$$
\begin{aligned}
P(\frac{1}{2})=&2\times \frac{1}{2}\times \frac{1}{2}\times \frac{1}{2}\times \frac{1}{2}\\&+3\times \frac{1}{2}\times \frac{1}{2}\times \frac{1}{2}\\&-3\times \frac{1}{2}\times \frac{1}{2}\\&+5\times \frac{1}{2}\\&-1
\end{aligned}\tag{0.4}
$$
There are $4+3+2+1=10$ multiplications and $1+1+1+1=4$ additions (subtraction is counted as addition here).
Method one has 10 multiplications and 4 additions.
Careful readers may have noticed that we computed ‘$\frac{1}{2}\times \frac{1}{2}$’ more often than necessary. Intermediate results can be stored in memory instead of being computed again and again, which saves a lot of computation. So the method becomes:
$$
\begin{aligned}
\frac{1}{2}\times\frac{1}{2}&=(\frac{1}{2})^2\\
\frac{1}{2}\times\frac{1}{2}\times\frac{1}{2}&=(\frac{1}{2})^2\times\frac{1}{2}=(\frac{1}{2})^3\\
\frac{1}{2}\times\frac{1}{2}\times\frac{1}{2}\times\frac{1}{2}&=(\frac{1}{2})^3\times\frac{1}{2}=(\frac{1}{2})^4
\end{aligned}\tag{0.5}
$$
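That is to say, $(\frac{1}{2})^2$ costs one multiplication, and $(\frac{1}{2})^3$ and $(\frac{1}{2})^4$ each cost one more, because each is built from the previous power. Multiplying each power by its coefficient costs one further multiplication, so the terms of degree 4, 3, and 2 need two multiplications each and the linear term needs one. In total there are $2+2+2+1=7$ multiplications and 4 additions (subtraction is counted as addition here).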
Method two has 7 multiplications and 4 additions.
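The two operation counts can be reproduced by instrumenting both methods. This is a sketch with hypothetical function names; `Fraction` keeps the arithmetic exact, and the counters are incremented by hand at every multiplication and addition.

```python
from fractions import Fraction

COEFFS = [2, 3, -3, 5, -1]   # coefficients of x^4, x^3, x^2, x^1, x^0

def eval_naive(x):
    """Method one: rebuild every power of x from scratch."""
    mults = adds = 0
    total = Fraction(COEFFS[-1])               # constant term
    for degree, c in zip((4, 3, 2, 1), COEFFS):
        term = Fraction(c)
        for _ in range(degree):                # c * x * x * ... * x
            term *= x
            mults += 1
        total += term
        adds += 1
    return total, mults, adds

def eval_cached_powers(x):
    """Method two: build each power of x from the previous one and reuse it."""
    mults = adds = 0
    powers = [x]                               # powers[k] holds x^(k+1)
    for _ in range(3):
        powers.append(powers[-1] * x)
        mults += 1
    total = Fraction(COEFFS[-1])
    for c, p in zip(COEFFS, reversed(powers)): # pairs (2, x^4), (3, x^3), ...
        total += c * p
        mults += 1
        adds += 1
    return total, mults, adds

half = Fraction(1, 2)
print(eval_naive(half))          # (Fraction(5, 4), 10, 4)
print(eval_cached_powers(half))  # (Fraction(5, 4), 7, 4)
```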
Rewrite the polynomial as:
$$
\begin{aligned}
P(x)&=-1+x(5-3x+3x^2+2x^3)\\
&=-1+x(5+x(-3+3x+2x^2))\\
&=-1+x(5+x(-3+x(3+2x)))\\
&=-1+x\times(5+x\times(-3+x\times(3+x\times 2)))
\end{aligned}\tag{0.6}
$$
When $x=\frac{1}{2}$, we evaluate the polynomial from the inside out:
$\frac{1}{2}\times 2$ , add $+3\to 4$
$\frac{1}{2}\times 4$ , add $-3\to -1$
$\frac{1}{2}\times (-1)$ , add $+5\to \frac{9}{2}$
$\frac{1}{2}\times \frac{9}{2}$ , add $-1\to \frac{5}{4}$
This method is called nested multiplication or Horner’s method, in which there are 4 multiplications and 4 additions. A general degree $d$ polynomial can be evaluated in $d$ multiplications and $d$ additions.
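Nested multiplication fits in a short loop. The sketch below takes the coefficients from the highest degree down to the constant term; the function name `horner` and the operation counters are additions made for illustration.

```python
from fractions import Fraction

def horner(coeffs, x):
    """Evaluate a polynomial whose coefficients run from highest degree to constant."""
    result = coeffs[0]
    mults = adds = 0
    for c in coeffs[1:]:
        result = result * x + c   # one multiplication and one addition per step
        mults += 1
        adds += 1
    return result, mults, adds

value, mults, adds = horner([2, 3, -3, 5, -1], Fraction(1, 2))
print(value, mults, adds)  # 5/4 4 4
```

For a polynomial of degree $d$ the loop runs $d$ times, which gives the $d$ multiplications and $d$ additions stated above.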
The example of polynomial evaluation is characteristic of the entire topic of computational methods for scientific computing. The standard form of a polynomial, $c_1+c_2x+c_3x^2+c_4x^3+c_5x^4$, can be written in nested form as:
$$
c_1+x(c_2+x(c_3+x(c_4+x(c_5))))\tag{0.7}
$$
In chapter 3 we will require the form:
$$
c_1+(x-r_1)(c_2+(x-r_2)(c_3+(x-r_3)(c_4+(x-r_4)(c_5))))\tag{0.8}
$$
where we call $r_1,r_2,r_3$ and $r_4$ the base points. When we set $r_1=r_2=r_3=r_4=0$, formula (0.8) reduces to formula (0.7).
A polynomial can be evaluated very efficiently through nested multiplication. But what is more important for us in this subject is the following:
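Form (0.8) can be evaluated with the same inside-out loop. The following is a sketch with a hypothetical function name `nested_eval`; coefficients are given as $c_1,\dots,c_n$ and base points as $r_1,\dots,r_{n-1}$, and omitting the base points sets them all to zero, which gives form (0.7).

```python
def nested_eval(coeffs, x, base_points=None):
    """Evaluate c1 + (x - r1)(c2 + (x - r2)(c3 + ...)) from the inside out."""
    if base_points is None:
        base_points = [0] * (len(coeffs) - 1)   # all r_i = 0 gives form (0.7)
    result = coeffs[-1]                         # innermost coefficient
    # Walk outwards, pairing c_i with r_i for i = n-1, ..., 1.
    for c, r in zip(reversed(coeffs[:-1]), reversed(base_points)):
        result = result * (x - r) + c
    return result

# P(x) = -1 + 5x - 3x^2 + 3x^3 + 2x^4 at x = 1/2, all base points zero.
print(nested_eval([-1, 5, -3, 3, 2], 0.5))      # 1.25

# 1 + (x - 1)(2 + (x - 2) * 3) at x = 3, with base points r1 = 1, r2 = 2.
print(nested_eval([1, 2, 3], 3, [1, 2]))        # 11
```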
Computers are very fast at doing very simple things.
It’s important to do even simple tasks as efficiently as possible.
The best way may not be the obvious way.
The post Introduction to Numerical Analysis Blogs appeared first on Tony's Blog.
Hi everyone, I’m Tony. This is my first blog post on this new website, which contains knowledge about Artificial Intelligence, such as mathematics, program design, algorithms, and so on.
This series is about ‘Numerical Analysis’, and I use the book ‘Numerical Analysis’ written by Timothy Sauer.
AI and similar subjects solve problems with computers. What we do every day is this: first, we translate real-world problems into mathematical problems, and then we tell the computers what they should do; the latter step is called ‘programming’. You may be in the same position as me, where our duty is to design algorithms, which is exactly the first step, the head of the chain. It is exciting and necessary for every algorithm designer to verify their algorithms, and we can never verify our algorithms without computation on a computer.
Through this subject, numerical analysis, we are going to study closely the computations done by modern computers. We will see the details of machine arithmetic and how a poorly designed calculation can ruin a whole project.
In this chapter, we will discuss:
1. Efficient methods for evaluating polynomials
2. Binary number system
3. The effects of the small rounding errors on computations
The most fundamental operations of arithmetic are addition and multiplication. In ‘Real Analysis’ [Tao, 2006], Prof. Tao builds the first number system by defining the natural numbers (1,2,3,4…) and then addition and multiplication. We should pay more attention to the most fundamental operations, for the more basic an operation is, the more we stand to gain by doing it right. On the other hand, the more basic an operation is, the more often it will be used in our further work.
My English blogging career begins today, and I’m trying my best to become an AI scientist, which is my lifelong dream, even though I have wasted 28 years of my life. You are welcome to leave me a message or email me about anything.