The post Binary Numbers appeared first on Tony's Blog.
Today we are going to talk about numbers, especially binary numbers. We have been able to use numbers since we were babies; for example, you must have asked your mother for ‘an’ apple, ‘a’ toy, or ‘one’ dollar. These words are so ordinary that for a long time we never notice the essential mathematical concept behind them: the number. Although we all know $1,2,3,\dots$ very well, most of us never think about what a number actually is until we are asked to. Because we will use binary and decimal numbers side by side, and the concept of number is the key to both, we have to take a closer look at its definition.
From now on, to make everything clear, we assume that we know nothing about numbers: even $1,2,3,\dots$ are unknown until they are defined formally and precisely.
When we receive something from others, whatever it is, what concerns us most is how many there are. This is a question about quantity, and the figure below can serve as an answer:
Three rows represent three different things:
□. Row I contains $\{\text{apple}\}$
○. Row II contains $\{\text{apple},\text{apple}\}$
△. Row III contains $\{\text{apple},\text{apple},\text{apple}\}$
To answer the question of how many there are, I have drawn three symbols on the right-hand side of the equations, each representing the number of things in its row. (There is a small gap here: we have not yet defined what ‘equals’ means.) So:
□. The quantity of apples in row I is the rectangle
○. The quantity of apples in row II is the circle
△. The quantity of apples in row III is the triangle
So far we have defined some numbers: $\{□,○,△\}$. Although we have defined only △ numbers (the quantity of $\{□,○,△\}$ is △), the same strategy can define as many numbers as we want. By now it may have dawned on you that *a number is just a symbol that gives a definite and unique answer to the question: how many things are there?* Even though we certainly could define enough symbols to answer any quantity question, doing so would be monotonous and inefficient. This leads to a new idea: use just ***a few symbols*** and combine them to create infinitely many different combinations. This simple idea gives us enough tools and materials to construct a concrete and elegant number system. Then some great forefathers created $\{0,1,2,3,\dots\}$. I have to admit that these symbols are more convenient than my ‘gurgles’ (the name of a baby show in ‘Good Luck Charlie’ whose heroes are a rectangle, a circle, etc.).
The number of symbols we use determines the number system. With just 2 symbols we get a binary number system, and with ten symbols we get a decimal number system.
Just as most humans only know decimal numbers, computers only know binary ones (or base-$2^n$ ones, such as the octal and hexadecimal systems) because of their hardware architecture. Binary numbers are expressed as:
$$
\dots b_2b_1b_0.b_{-1}b_{-2}\dots \text{ where } b_i\in\{0,1\}
$$
This may look weird if you are not a computer science student, but all our computations on computers, smartphones, and so on are based on binary. Each binary digit is called a bit.
Let’s look at an example. The decimal number 4 can be expressed as $(100.)_2$ in base 2, which we write in the form:
$$
(4)_{10}=(100)_2
$$
Translating binary into decimal so that we can read it is relatively easier than the reverse:
$$
n_{10}=\dots b_2 2^{2}+b_1 2^{1}+b_0 2^{0}+b_{-1} 2^{-1}+b_{-2} 2^{-2}+\dots \tag{1}
$$
The powers $2^{n}$ here must be calculated in the decimal system, where $2^2=4, 2^{10}=1024,\dots$
For example, convert $(10010)_2$ to decimal:
$$
1\times 2^4 + 0\times 2^3 + 0\times 2^2 + 1\times 2^1 + 0\times 2^0\\
=16_{10}+0_{10}+0_{10}+2_{10}+0_{10}\\
=18_{10}
$$
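Formula (1) can be sketched in a few lines of Python. This is only a minimal illustration: the function name `binary_to_decimal` and the use of `Fraction` (to keep the fractional digits exact) are my own assumptions, not part of the method itself.

```python
from fractions import Fraction

def binary_to_decimal(bits: str) -> Fraction:
    """Evaluate formula (1) for a finite binary string such as '10010' or '0.1011'."""
    integer_part, _, fractional_part = bits.partition(".")
    total = Fraction(0)
    # b_i * 2^i for the digits left of the radix point (i = 0 is the rightmost digit).
    for i, digit in enumerate(reversed(integer_part)):
        total += int(digit) * Fraction(2) ** i
    # b_{-i} * 2^{-i} for the digits right of the radix point.
    for i, digit in enumerate(fractional_part, start=1):
        total += int(digit) * Fraction(1, 2 ** i)
    return total

print(binary_to_decimal("10010"))  # 18
print(binary_to_decimal("100"))    # 4
```

The function only handles finite digit strings; repeating expansions need the extra trick discussed later in this post.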
The algorithm we discuss next converts decimal numbers to binary numbers.
We divide a decimal number into two parts, the integer part and the fractional part. For example,
$$
(53.7)_{10}=(53.)_{10}+(0.7)_{10}
$$
We divide the integer part by $2$ repeatedly until the quotient is $0$, recording the remainders, which are always $0$ or $1$. For example, $5\div 2 =2 \dots 1$, where $2$ is the quotient and $1$ is the remainder. The recorded remainders are written starting at the decimal point (radix point may be the more accurate term) and moving left, so the first remainder is the least significant bit:
$$
\begin{aligned}
53\div 2&=26 &\dots 1\\
26\div 2&=13 &\dots 0\\
13\div 2&=6 &\dots 1\\
6\div 2&=3 &\dots 0\\
3\div 2&=1 &\dots 1\\
1\div 2&=0 &\dots 1\\
\end{aligned}
$$
So $53.$ in base 10 is equal to $110101.$ in base 2. To check this result, we can use formula (1):
$$
1\times 2^5+1\times 2^4+0\times 2^3+1\times 2^2+0\times 2^1+1\times 2^0=53
$$
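The repeated division above might be sketched like this in Python. The function name `integer_to_binary` is hypothetical, and the built-in `divmod` returns the quotient and remainder in one step.

```python
def integer_to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient replaces n, remainder is 0 or 1
        remainders.append(str(remainder))
    # The first remainder is the digit next to the radix point,
    # so the list is reversed to read the most significant digit first.
    return "".join(reversed(remainders))

print(integer_to_binary(53))  # 110101
```

Reversing the list at the end is just a convenience; writing each remainder to the left of the previous one, as in the table, gives the same digits.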
To convert $(0.7)_{10}$ to binary, reverse the preceding steps: multiply by 2 repeatedly, record the integer part each time, remove it, and continue with the remaining fractional part:
$$
\begin{aligned}
0.7\times 2&=0.4 &+ 1\\
0.4\times 2&=0.8 &+ 0\\
0.8\times 2&=0.6 &+ 1\\
0.6\times 2&=0.2 &+ 1\\
0.2\times 2&=0.4 &+ 0\\
0.4\times 2&=0.8 &+ 0\\
0.8\times 2&=0.6 &+ 1\\
0.6\times 2&=0.2 &+ 1\\
0.2\times 2&=0.4 &+ 0\\
&\vdots&
\end{aligned}
$$
Notice that the steps from $0.4\times 2$ to $0.2\times 2$ repeat over and over, so the digits $0110$ repeat infinitely. We write this as:
$$
(0.7)_{10}=(0.1\overline{0110})_{2}
$$
From this, we conclude that
$$
53.7_{10}=(110101.1\overline{0110})_{2}
$$
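The repeated multiplication, including the detection of the repeating block, could be sketched as follows. This is an illustration under my own assumptions: the name `fraction_to_binary`, the `max_digits` cap, and the parentheses that stand in for the overline are all hypothetical choices. Exact `Fraction` arithmetic is used so that a repeated fractional part can be recognized reliably.

```python
from fractions import Fraction

def fraction_to_binary(x: Fraction, max_digits: int = 64) -> str:
    """Convert 0 <= x < 1 to binary; a repeating block is wrapped in parentheses."""
    digits = []
    seen = {}  # fractional part -> index of the digit it is about to produce
    while x != 0 and len(digits) < max_digits:
        if x in seen:
            # The same fractional part came up before, so the digits repeat from there.
            start = seen[x]
            return "0." + "".join(digits[:start]) + "(" + "".join(digits[start:]) + ")"
        seen[x] = len(digits)
        x *= 2
        digit = int(x)             # the integer part is the next binary digit
        digits.append(str(digit))
        x -= digit                 # keep only the fractional part
    return "0." + ("".join(digits) or "0")

print(fraction_to_binary(Fraction(7, 10)))  # 0.1(0110)
```

Here `0.1(0110)` means the same as $(0.1\overline{0110})_2$.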
Formula (1) tells us how to convert binary numbers to base 10; with a few small tricks we can also handle the fractional part concisely.
$$
\begin{aligned}
(.1011)_2&=1\times(\frac{1}{2})^1+0\times(\frac{1}{2})^2+1\times(\frac{1}{2})^3+1\times(\frac{1}{2})^4\\
&=(\frac{11}{16})_{10}
\end{aligned}
$$
This process is straightforward, but how should infinitely repeating expansions be calculated? Suppose $x=(0.\overline{1011})_2$, and let’s convert it to decimal:
$$
\begin{aligned}
x&=0.\overline{1011}\\
2^4x &=1011. \overline{1011}\\
2^4x -x&=1011. \overline{1011}-0.\overline{1011}\\
(16-1)_{10}x&=1011_{2}=11_{10}\\
x&=(\frac{11}{15})_{10}
\end{aligned}
$$
Here is a more complicated example: what is $x=(0.10\overline{101})_2$ in decimal form?
First, shift the non-repeating digits to the left of the radix point:
$$
z=2^2x=(10.\overline{101})_2
$$
Since
$$
(10)_2=2_{10}
$$
we set:
$$
y_{10}=(z-2)_{10}=(.\overline{101})_2
$$
Using the same method as in the last example, we get:
$$
(2^3-1)y_{10}=101_2=5_{10}\\
y_{10}=(\frac{5}{7})_{10}
$$
Then we get $z$ from the third formula:
$$
z=y_{10}+2=\frac{19}{7}
$$
and finally $x$ from the first formula:
$$
x=z_{10}\div 4=\frac{19}{28}
$$
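Both repeating examples follow the same pattern, which can be sketched as a small Python helper. The function name and its two arguments (`prefix` for the non-repeating digits, `block` for the repeating digits) are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction

def repeating_binary_to_fraction(prefix: str, block: str) -> Fraction:
    """Value of (0.prefix block block block ...)_2 as an exact fraction."""
    shift = 2 ** len(prefix)
    # Non-repeating digits contribute int(prefix, 2) / 2^len(prefix).
    head = Fraction(int(prefix, 2), shift) if prefix else Fraction(0)
    # Repeating digits: (2^k - 1) * y = block, where k = len(block).
    y = Fraction(int(block, 2), 2 ** len(block) - 1)
    # The repeating part starts len(prefix) places after the radix point.
    return head + y / shift

print(repeating_binary_to_fraction("", "1011"))   # 11/15
print(repeating_binary_to_fraction("10", "101"))  # 19/28
```

Calling it with `prefix="1"` and `block="0110"` returns `7/10`, which agrees with the expansion of $(0.7)_{10}$ found earlier.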
In this post we have learned about binary numbers; the central topic is how to convert between decimal and binary.
The post A Brief Introduction to Reinforcement Learning appeared first on Tony's Blog.
Some of you may be confused by the title, since most blog series or articles begin with a ‘real’ introduction. Believe me, I tried to gather everything I think should be known before learning reinforcement learning algorithms into a really good introduction, but I finally gave up because there are far too many aspects to talk about. Then an idea struck me: why not write a brief introduction covering only the most basic concepts, and present a good survey or overall summary at the end of the series? So in this article, I will just talk about:
1. What Is Reinforcement Learning
2. Supervised Learning, Unsupervised Learning and Reinforcement Learning
3. Some Basic Concepts
This is always our first question about the subject, and more details can be found on Wikipedia under ‘Reinforcement Learning’^{1}. Here, however, I want to present a more readable interpretation: reinforcement learning is a kind of machine learning whose purpose is to solve problems by imitating, through computer programs, the learning process of human beings or other intelligent creatures. This long sentence contains three important points:
1. The purpose is to solve problems that we come across and that have not been solved by already known methods
2. Most of the ideas in reinforcement learning come from psychology and neuroscience
3. We simulate all the conditions and run our algorithms on one (or more) modern computer(s)
This might be the best introduction I can give in English, yet some readers may still be confused, because everything described above sounds just like ordinary machine learning, such as linear regression, or even like something less powerful than neural networks. So let’s look at the distinctions between reinforcement learning and other machine learning algorithms.
By the way, if you wonder why it is called ‘reinforcement learning’, you can trace the word ‘reinforcement’ back to psychology research (Schultz W (July 2015)^{2}) or just read the Wikipedia article ‘Reinforcement’^{3}.
Supervised learning is probably the most popular set of methods we hear about in machine learning or artificial intelligence. Machine learning is the bigger concept; it contains supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and so on.
Linear classification algorithms are among the most accessible machine learning methods to begin with. I only mention the simplest case here: say we have sampled 6 2-D points from a dataset, and each of them belongs to one of two classes; we call this class information the ‘label’. Our mission is to find a line or a hyperplane that separates the whole dataset into two classes. Though these points may be mixed together, we still have a strategy to accomplish the mission. What we need and what we depend on are the 6 already known points and their classes (labels), for instance:
Point Name | Data | Class(Label) |
---|---|---|
$A$ | $(1,2)$ | A |
$A_1$ | $(2,1)$ | A |
$A_2$ | $(2,2)$ | A |
$B$ | $(6,4)$ | B |
$B_1$ | $(4,4)$ | B |
$B_2$ | $(1,2)$ | B |
This is just a naive linear classification problem, but the defining feature of supervised learning methods is captured by the sentence ‘What we need and what we depend on are the 6 points and their classes (labels)’. The label may be the most valuable information for finding a solution. We can easily find some solutions to this simple problem, like these:
Given the information we already have, these three lines are all valid solutions for now. Which one is better, and which should be chosen as the final classifier, is beyond the scope of this blog series. If you want to know more about this issue, you can read the book ‘Pattern Recognition and Machine Learning’ (Bishop C M (2006)^{4}).
In contrast, reinforcement learning does not have label information that teaches the algorithm step by step what to do or what not to do, and this is one of its essential features. Reinforcement learning does, however, have a clear goal. For instance, when a robot is learning to clean a room, there is no information or teacher to tell it which direction to go or how long the first step should be. The only useful information known before the mission is the goal itself: ‘clean up this room’. What happens next largely depends on the policies that are initially in the robot’s ‘brain’. In this view, the way reinforcement learning learns is closer to the way we humans learn.
Unsupervised learning also lacks label information, yet it differs from both supervised learning and reinforcement learning. The mission of unsupervised learning is to find the hidden structure behind a mess of data. Let’s look at the naive example above again, now with the label information removed:
num | data |
---|---|
$D$ | $(1,2)$ |
$D_1$ | $(2,1)$ |
$D_2$ | $(2,2)$ |
$D_3$ | $(6,4)$ |
$D_4$ | $(4,4)$ |
$D_5$ | $(1,2)$ |
Then the data are distributed in a 2-D plane like this:
The mission of unsupervised learning is to design an algorithm or a strategy that finds some structure behind the data. Though this mission can be accomplished by totally different algorithms, a good one gives us more useful information.
Figure 3 shows one simple solution, but it is not the only one, nor necessarily the best one.
Unsupervised learning, like reinforcement learning, does not have any label information; in other words, neither has a teacher to tell it what to do next. On the other hand, reinforcement learning must have a clear goal, which is not necessary for unsupervised learning.
We have discussed some features that distinguish reinforcement learning from supervised and unsupervised learning; these features make reinforcement learning a separate branch of machine learning. We now go into the details of reinforcement learning to learn some basic concepts.
An agent, of course, is not a super spy like Ethan Hunt (from the ‘Mission: Impossible’ film series) or some other person with super abilities. It can be an animal, a robot, and so on. What agents have in common is a clear goal: a newborn deer just wants to stand on its feet, a robot just wants to clean the room. Their challenge is not to learn from labeled data or to find structure behind data, but to reach their goals by interacting with their environment. For the deer, the environment might be the earth's gravity, the wind, or even the slippery ground. There is definitely no teacher telling them what they should and should not do. They have to decide their next actions by themselves and make sure these actions help them attain their goals.
An agent must have some abilities. First, it should be able to sense its environment; what it senses is known as the state of the environment. This is very important to any agent: if you were already standing on your feet, you would not try to stand up again. Second, the actions of the agent change the environment; for example, every action of the robot makes the room different from before. Finally, the agent decides what to do all by itself, based on its policies and the state of the environment.
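The three abilities (sensing the state, acting on the environment, and deciding by a policy) can be illustrated with a toy cleaning robot. Everything here is a hypothetical sketch: the one-dimensional ‘room’, the action names, and the hand-written policy are my own assumptions, and no learning takes place yet; the code only shows the interaction loop.

```python
def policy(state):
    """Hypothetical policy: clean the current tile if it is dirty, otherwise move right."""
    position, tiles = state
    return "clean" if tiles[position] else "move_right"

def step(state, action):
    """Hypothetical environment: apply the action and return the new state."""
    position, tiles = state
    tiles = list(tiles)
    if action == "clean":
        tiles[position] = False          # the action changes the environment
    elif action == "move_right" and position < len(tiles) - 1:
        position += 1
    return position, tuple(tiles)

def run(tiles, max_steps=100):
    """Run the sense-decide-act loop until no tile is dirty."""
    state = (0, tuple(tiles))            # robot position and dirty flags
    for _ in range(max_steps):
        if not any(state[1]):            # goal reached: the room is clean
            break
        action = policy(state)           # the agent senses the state and decides
        state = step(state, action)      # the environment responds
    return state

print(run([True, False, True]))  # (2, (False, False, False))
```

In a real reinforcement learning algorithm the policy would be improved from experience instead of being fixed in advance.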
This is a brief description of an agent; more details can be found in Richard S. Sutton, A. G. B. (2011)^{5}.
Everything in the problem is a component of the environment; even the agent is part of it. A precise definition of the environment in reinforcement learning is neither easy nor necessary. What we should remember is that the agent lives, senses, and acts in its environment, and always will.
Reward and value are two concepts so similar that beginners often cannot tell them apart clearly. Here is a view from which we can distinguish them easily.
The reward is the real signal we have received or will receive from the environment. For instance, when we are playing a multi-armed bandit, what we get from an action is the reward, and whether we win or lose is decided by the machines or, more precisely, by the environment. The reward is fixed by the machines and will not be changed by anything for any reason. The reward signal, or reward for short, is a real signal produced by the interaction between an agent and its environment.
In contrast, value is calculated by a value function, which is designed before the actions are taken. It works like an oracle that tells you what might happen after each action. In other words, the value is an estimate of an action's outcome before the action is actually taken.
In our reinforcement learning algorithms, the goal of an agent is always converted into maximizing the agent’s total reward. The reward (signal) depends only on the action and the environment, but the value can depend on anything; sometimes it can even be stochastic. Nevertheless, value functions remain a very reliable piece of information to help the agent make decisions.
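To make the difference concrete, here is a small sketch of a multi-armed bandit in Python. It is only one possible illustration: the win probabilities are made up, and the epsilon-greedy choice with sample-average value estimates is a common textbook technique that I am assuming here, not something this article prescribes. The `reward` comes from the environment; the `values` list is the agent's own estimate.

```python
import random

def run_bandit(win_probabilities, steps=1000, epsilon=0.1, seed=0):
    """Play a bandit; return the value estimates, pull counts, and total reward."""
    rng = random.Random(seed)
    n_arms = len(win_probabilities)
    values = [0.0] * n_arms   # the agent's estimates (value)
    counts = [0] * n_arms     # how often each arm was pulled
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit the estimates
        # The reward is decided by the environment, not by the agent.
        reward = 1 if rng.random() < win_probabilities[arm] else 0
        counts[arm] += 1
        # Sample average: move the estimate toward the observed reward.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return values, counts, total_reward

values, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
print(values, counts, total_reward)
```

The estimates in `values` can be wrong, especially for arms that were rarely pulled, while each `reward` is simply what the machine paid out.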
A policy determines how the learning agent behaves at a given time. A vivid description is that a policy is the brain, or logical system, of an agent, where the agent itself is usually regarded as an algorithm.
Action and state are both elementary concepts that I have already mentioned in the sections on the agent and the environment.
There could be some other sections, such as limitations, scope, and some suggestions, but I do not think they are useful for a beginner, so I plan to discuss all of these at the end of the series, as a survey.
This is my first article about reinforcement learning. The concepts will be useful in our future study of algorithms, while the distinctions between reinforcement learning and other machine learning algorithms give us a big map that shows where we are.
The post Efficient Methods for Evaluating Polynomials appeared first on Tony's Blog.
It has been said that the more basic an operation is, the more we stand to gain by doing it efficiently. Addition and multiplication may be the most basic operations of all, and so is their combination, the polynomial. Many functions, however complicated they are, can be approximated by a polynomial. For instance, $\sin(x),x\in[-\pi,\pi]$ can be approximated by
$$
0.987862x-0.155271x^3+0.00564312x^5 \tag{0.1}
$$
Their curves look like this:
The figures above show how well polynomials can approximate a more sophisticated function. However, how to obtain this polynomial is not our concern here; what we care about is how to compute formula (0.1) more efficiently.
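Since the curves are not reproduced here, a quick numerical check can stand in for them. The sketch below compares formula (0.1) with `math.sin` on evenly spaced sample points; the function names `sin_approx` and `max_error` and the number of samples are my own choices.

```python
import math

def sin_approx(x: float) -> float:
    """The polynomial from formula (0.1)."""
    return 0.987862 * x - 0.155271 * x**3 + 0.00564312 * x**5

def max_error(samples: int = 1001) -> float:
    """Largest |polynomial - sin| over evenly spaced points in [-pi, pi]."""
    xs = [-math.pi + 2 * math.pi * i / (samples - 1) for i in range(samples)]
    return max(abs(sin_approx(x) - math.sin(x)) for x in xs)

print(max_error())
```

On these sample points the error stays small compared with the range of $\sin(x)$, which is what the figures are meant to show.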
We now replace polynomial (0.1) with a more general one that contains a term for every integer exponent from 0 to 4:
$$
P(x)=2x^4+3x^3-3x^2+5x-1\tag{0.2}
$$
What we are going to do is find the best way to evaluate polynomial (0.2) at $x=\frac{1}{2}$. We assume that the coefficients of the polynomial and the number $\frac{1}{2}$ are always stored in memory or even in registers; that is, we do not take data transfer time into account.
How to measure the efficiency of an evaluation is a new problem we face, and I see two ways to solve it:
The second is to count the total number of operations in the algorithm. For example, if our algorithm uses 10 multiplications and 5 additions while the state-of-the-art algorithm uses 11 multiplications and 5 additions, our algorithm is better.
In this section, we measure the efficiency of an algorithm in the second way, by counting the number of operations.
Now we go back to our main problem: finding the best way to evaluate
$$
P(x)=2x^4+3x^3-3x^2+5x-1\tag{0.3}
$$
In the usual way, we calculate
$$
\begin{aligned}
P(\frac{1}{2})=&2\times \frac{1}{2}\times \frac{1}{2}\times \frac{1}{2}\times \frac{1}{2}\\&+3\times \frac{1}{2}\times \frac{1}{2}\times \frac{1}{2}\\&-3\times \frac{1}{2}\times \frac{1}{2}\\&+5\times \frac{1}{2}\\&-1
\end{aligned}\tag{0.4}
$$
There are $4+3+2+1=10$ multiplications and $1+1+1+1=4$ additions (subtraction is counted as addition here).
Method one has 10 multiplications and 4 additions.
Attentive readers may have noticed that we computed ‘$\frac{1}{2}\times \frac{1}{2}$’ more times than necessary. Intermediate results can be stored in memory instead of being computed again and again, which saves a lot of computation. So the method becomes:
$$
\begin{aligned}
\frac{1}{2}\times\frac{1}{2}&=(\frac{1}{2})^2\\
\frac{1}{2}\times\frac{1}{2}\times\frac{1}{2}&=(\frac{1}{2})^2\times\frac{1}{2}=(\frac{1}{2})^3\\
\frac{1}{2}\times\frac{1}{2}\times\frac{1}{2}\times\frac{1}{2}&=(\frac{1}{2})^3\times\frac{1}{2}=(\frac{1}{2})^4
\end{aligned}\tag{0.5}
$$
That is to say, $(\frac{1}{2})^2$ takes one multiplication, and $(\frac{1}{2})^3$ and $(\frac{1}{2})^4$ each take just one more multiplication on top of the previous power. Multiplying each power by its coefficient costs one further multiplication per term, so the terms of degree 4, 3, 2, and 1 cost $2+2+2+1=7$ multiplications in total, plus 4 additions (subtraction is counted as addition here).
Method two has 7 multiplications and 4 additions.
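The operation counts of methods one and two can be checked with a short sketch. The function names are hypothetical, `coeffs[k]` is taken to be the coefficient of $x^k$, and `Fraction` is used only to keep $\frac{1}{2}$ exact.

```python
from fractions import Fraction

def evaluate_naive(coeffs, x):
    """Method one: rebuild every power from scratch. Returns (value, multiplications)."""
    total, mults = 0, 0
    for k, c in enumerate(coeffs):
        term = c
        for _ in range(k):          # c * x * x * ... * x, with k multiplications
            term *= x
            mults += 1
        total += term
    return total, mults

def evaluate_with_stored_powers(coeffs, x):
    """Method two: reuse the previous power. Returns (value, multiplications)."""
    total, mults = coeffs[0], 0
    power = x                        # x^1 is free: it is already in memory
    for k in range(1, len(coeffs)):
        if k > 1:
            power = power * x        # one multiplication extends x^(k-1) to x^k
            mults += 1
        total += coeffs[k] * power   # one multiplication by the coefficient
        mults += 1
    return total, mults

coeffs = [-1, 5, -3, 3, 2]           # P(x) = 2x^4 + 3x^3 - 3x^2 + 5x - 1
x = Fraction(1, 2)
print(evaluate_naive(coeffs, x))               # (Fraction(5, 4), 10)
print(evaluate_with_stored_powers(coeffs, x))  # (Fraction(5, 4), 7)
```

Both functions return the same value, $\frac{5}{4}$; only the number of multiplications differs.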
Rewrite the polynomial as:
$$
\begin{aligned}
P(x)&=-1+x(5-3x+3x^2+2x^3)\\
&=-1+x(5+x(-3+3x+2x^2))\\
&=-1+x(5+x(-3+x(3+2x)))\\
&=-1+x\times(5+x\times(-3+x\times(3+x\times 2)))
\end{aligned}\tag{0.6}
$$
When $x=\frac{1}{2}$, we evaluate the polynomial from the inside out:
$\frac{1}{2}\times 2$ , add $+3\to 4$
$\frac{1}{2}\times 4$ , add $-3\to -1$
$\frac{1}{2}\times (-1)$ , add $+5\to \frac{9}{2}$
$\frac{1}{2}\times \frac{9}{2}$ , add $-1\to \frac{5}{4}$
This method is called nested multiplication or Horner’s method; it uses 4 multiplications and 4 additions. A general degree-$d$ polynomial can be evaluated in $d$ multiplications and $d$ additions.
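Nested multiplication is short enough to write down directly. In this sketch the function name `horner` and the counters are my own additions for illustration; `coeffs[k]` is again the coefficient of $x^k$.

```python
from fractions import Fraction

def horner(coeffs, x):
    """Evaluate a polynomial from the inside out.

    Returns (value, multiplications, additions)."""
    result = coeffs[-1]              # start with the leading coefficient
    mults = adds = 0
    for c in reversed(coeffs[:-1]):
        result = result * x + c      # one multiplication and one addition per step
        mults += 1
        adds += 1
    return result, mults, adds

print(horner([-1, 5, -3, 3, 2], Fraction(1, 2)))  # (Fraction(5, 4), 4, 4)
```

The intermediate values of `result` are 2, 4, -1, 9/2, and 5/4, matching the hand calculation above.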
The example of polynomial evaluation is characteristic of the entire topic of computational methods for scientific computing. The standard form of a polynomial, $c_1+c_2x+c_3x^2+c_4x^3+c_5x^4$, can be written in nested form as:
$$
c_1+x(c_2+x(c_3+x(c_4+x(c_5))))\tag{0.7}
$$
In chapter 3 we will require the form:
$$
c_1+(x-r_1)(c_2+(x-r_2)(c_3+(x-r_3)(c_4+(x-r_4)(c_5))))\tag{0.8}
$$
where we call $r_1,r_2,r_3$ and $r_4$ the base points. When we set $r_1=r_2=r_3=r_4=0$, formula (0.8) reduces to formula (0.7).
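Formula (0.8) can be evaluated with the same inside-out loop as nested multiplication. The sketch below is an illustration under my own naming: `coeffs` holds $c_1,\dots,c_5$ and `base_points` holds $r_1,\dots,r_4$.

```python
from fractions import Fraction

def nested_with_base_points(coeffs, base_points, x):
    """Evaluate c_1 + (x - r_1)(c_2 + (x - r_2)(c_3 + ...)) from the inside out.

    coeffs = [c_1, ..., c_{d+1}], base_points = [r_1, ..., r_d]."""
    result = coeffs[-1]
    # Pair c_d with r_d, then c_{d-1} with r_{d-1}, and so on down to c_1 with r_1.
    for c, r in zip(reversed(coeffs[:-1]), reversed(base_points)):
        result = result * (x - r) + c
    return result

# With all base points equal to 0 this is formula (0.7), i.e. plain nested multiplication.
print(nested_with_base_points([-1, 5, -3, 3, 2], [0, 0, 0, 0], Fraction(1, 2)))  # 5/4
```

Each step costs one subtraction more than plain nested multiplication, since $x-r_i$ has to be formed first.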
Polynomials can be evaluated very efficiently through nested multiplication. But what matters more for us in this subject are these points:
Computers are very fast at doing very simple things.
It’s important to do even simple tasks as efficiently as possible.
The best way may not be the obvious way.
The post Introduction to Numerical Analysis Blogs appeared first on Tony's Blog.
Hi everyone, I’m Tony. This is my first post on this new website, which covers knowledge about Artificial Intelligence, such as mathematics, program design, algorithms, and so on.
This series is about numerical analysis, and I follow the book ‘Numerical Analysis’ by Timothy Sauer.
AI and similar subjects solve problems with computers. What we do every day is this: first we translate real-world problems into mathematical problems, and then we tell the computers what they should do; the latter step is called ‘programming’. You may be in the same position as me, where our duty is to design algorithms, which is the first step, the head, of the chain. It is exciting and necessary for every algorithm designer to verify their algorithms. However, we can never verify our algorithms without computation on a computer.
Through this subject, numerical analysis, we are going to make a close study of the computations done by modern computers. We will see the details of machine arithmetic and how a poorly designed calculation can ruin a whole project.
In this chapter, we will discuss:
1. Efficient methods for evaluating polynomials
2. Binary number system
3. The effects of the small rounding errors on computations
The most fundamental operations of arithmetic are addition and multiplication. In ‘Real Analysis’ [Tao, 2006], Prof. Tao builds the first number system by defining the natural numbers (1,2,3,4…) and then addition and multiplication. We should pay more attention to the most fundamental operations, for the more basic an operation is, the more we stand to gain by doing it right. On the other hand, the more basic an operation is, the more often it will be used in our further work.
My English blogging career begins today. I’m trying my best to become an AI scientist, which is my lifelong dream, even though I have wasted 28 years of my life. You are welcome to leave me a message or email me about anything.