Thousands of years ago, our ancestors were already able to distinguish a single apple from a sack of apples. The single apple and the apples in the sack can be regarded as specific objects; that is, every apple can be identified among the ‘bunch’ of apples, and we can label them as $\{\text{Apple}_1,\text{Apple}_2,\dots ,\text{Apple}_n\}$, and these labels have:

However, the sack of apples as a whole also attracted the attention of human beings. It is known as a set.

Let’s see another example of a set. We now have an Incident Response Team made up of several Avengers. They are:

But one day, they found that they needed a more powerful guy to help them learn probability theory. So they invited me to join, and the team became:

Aha, the team is now listed as:

Since we now have two ‘Tony’s, the list above alone cannot tell us which Tony is me and which one is Anthony Edward Stark (Iron Man).

The problem we came across illustrates an essential property of a set: there should be no duplicates in the list of its members. There are many ways to solve the duplicate problem; for instance, we can list the family names as well, so the members are

The same member cannot be counted twice in a set, and different members with the same name in the list are considered to be the same one (we have no other information to tell them apart). Although the duplicate problem has been solved by adding family names, listing the members of a set does not always work.
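The duplicate problem can be illustrated with Python's built-in `set`. This is only a sketch: the roster below is hypothetical (the original team list is not reproduced here), and `"AuthorFamilyName"` is a placeholder for the author's family name.

```python
# Hypothetical roster: every name except "Tony" is a placeholder for illustration.
first_names = ["Steve", "Natasha", "Tony", "Tony"]  # two different people named Tony

# A set keeps each distinct label only once, so the two Tonys collapse into one member.
team_by_first_name = set(first_names)

# Adding a family name makes the two labels distinct again.
full_names = [
    ("Steve", "Rogers"),
    ("Natasha", "Romanoff"),
    ("Tony", "Stark"),
    ("Tony", "AuthorFamilyName"),  # placeholder, not the author's real name
]
team_by_full_name = set(full_names)

print(len(team_by_first_name))  # 3: one Tony was lost
print(len(team_by_full_name))   # 4: both Tonys are kept
```

With first names only, the set cannot tell the two Tonys apart; with the richer labels it can.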

To represent a specific set, the list should be complete. However, most of the time we are not able to list a set completely. Here are some examples of sets whose members cannot all be listed:

- In geometry, we define a circle as “the set of points which are equidistant from a given point”
- In algebra, “the set of integers greater than 1 which have no positive divisors other than 1 and themselves” is called the set of prime numbers

It seems impossible to enumerate all the elements of these two sets.

In probability, the notion of a set plays a more fundamental role. Sets can be divided into general (abstract) ones and concrete ones. Since general things are always harder to understand, we begin our probability theory with some easier concrete examples:

- a) A bushel of apples
- b) All possible outcomes when six dice are rolled
- c) All students in the college

Then let’s look at the smaller ones:

- a’) The rotten apples in that bushel
- b’) The outcomes in which the six dice all show different faces
- c’) The mathematics majors of that college

All the examples, a) to c) and a’) to c’), are sets. These examples and the Avengers example have something abstract in common: they are all “a bunch of things”.

The examples can help us understand intuitively what a set is. However, we are mathematicians, and we should be more mathematical and more precise. “A bunch of things” can never be a formal definition, so we need to set up a mathematical model for a set.

Every single thing (object, situation, result, etc.) in the bunch will be called a ‘point’, and the bunch of things is called a ‘space’. If you have learned linear algebra, you will be familiar with the word ‘space’: a space in linear algebra is made up of infinitely many vectors that obey certain rules. For instance, in example a), every apple is a point, and, of course, the apples are different from each other; the bushel which contains all the apples is the space. In the same way, we can find the points in the other examples, and each of those examples is a space. To distinguish the space in probability theory from spaces in other contexts, we prefix it with ‘sample’, and we do the same for the point. They become the sample space and the sample point.

We use $\Omega$ to denote the sample space, and the sample points in the sample space are denoted by $\omega$. Every point can have a specific name, as in the picture where I drew a sack of things.

Its mathematical model is:

Sets are made up of elements; in other words, spaces consist of points. When all the elements of a set $S_n$ come from another set $S_0$, the set $S_n$ is a subset of $S_0$. The sets in examples a’) to c’) are subsets of the sets in examples a) to c). Two extreme subsets are the biggest one, the set itself, and the smallest one, the empty set, which contains nothing.

The empty set is a special case in set theory and it has its own notation, $\emptyset$.

“The subset is smaller than the original set and the empty set is the smallest subset”: this description uses a concept we have not defined yet, the size of a set, which tells us which of two sets is bigger. The number of points in a set is called its “size”, and it is denoted as:

**The size of**

**is**

**The size of**

**is**

The size has several properties, and here are the most useful three:

- The size of a finite set is a non-negative integer
- The size can be infinite
- The size of the empty set is 0
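For finite sets, the size can be checked directly with Python's `len`. This is a small sketch; the apple labels are hypothetical, and infinite sets cannot be represented this way.

```python
# Hypothetical sample space: a very small bushel of labelled apples.
bushel = {"apple_1", "apple_2", "apple_3", "apple_4"}
# Hypothetical subset: the rotten apples in that bushel.
rotten = {"apple_2", "apple_4"}
# The empty set contains no points at all.
empty = set()

sizes = (len(bushel), len(rotten), len(empty))
print(sizes)  # (4, 2, 0)
```

The size of the subset never exceeds the size of the original set, and the empty set has size 0.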

By the way, the natural numbers can be defined through the sizes of sets, but that is not our business here.

“Finite” and “infinite” are two general classifications of size. “Countable” and “uncountable” are another pair of distinct concepts. “Countable” is not the same as “finite”, while every “uncountable” set is infinite.

Their precise relation is shown in the Venn Diagram

Definition of Countable/Uncountable: An infinite set $A$ is countable if there is a one-to-one correspondence between the elements of $A$ and the set of natural numbers $\{1, 2, 3, \dots\}$. A set is uncountable if it is neither finite nor countable. If we say that a set has at most countably many elements, we mean that the set is either finite or countable.

^{1}

The most common example of a set which is infinite but countable is the set of all odd numbers, where the index of element $n$ can be calculated by:
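One possible correspondence can be sketched in Python. It assumes that “odd numbers” means the positive odd numbers $1, 3, 5, \dots$ and that they are indexed from 1; the function names `index_of_odd` and `odd_at_index` are made up for this illustration.

```python
def index_of_odd(n: int) -> int:
    """Position (1, 2, 3, ...) of the positive odd number n in the list 1, 3, 5, ..."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    return (n + 1) // 2


def odd_at_index(k: int) -> int:
    """Inverse map: the k-th positive odd number."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return 2 * k - 1


print(index_of_odd(7))  # 4, because 7 is the fourth entry of 1, 3, 5, 7
print(odd_at_index(4))  # 7
```

Because each map undoes the other, every positive odd number is paired with exactly one natural number, which is what countability requires.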

All the sets above are defined in words. Although we now have a mathematical model, it is still difficult to find a method to define a set rigorously. What a well-defined set means is that it is possible to tell whether any given point belongs to the set or not.

*Belong to* is a relation between a point and a space. We say “Iron Man” (point) belongs to the Avengers (space). It’s denoted as:

And “not belong to” is denoted as:

The first method that can define a set rigorously is enumeration: we put all the points in curly brackets. For example, rolling a six-sided die may give $6$ outcomes:

Intuitively, rolling two different six-sided dice may give $6\times 6$ outcomes:

As we can see, two different dice produce $6^2=36$ outcomes, and three different dice produce $6^3=216$ outcomes. When we roll $6$ dice, we get a set of size $6^6=46656$. It is impractical to list all the combinations by hand. However, we can solve this problem through a mathematical model:

This is a simple use of a mathematical expression. However, enumeration does not always work, for example when defining the set of all the girls in the world. Listing all their names is tedious and practically impossible, and this set is dynamic because girls are being born and are dying at every moment.
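The dice sample spaces are small enough for a computer to enumerate. The sketch below uses `itertools.product` from the standard library and assumes the dice are distinguishable, so each outcome is an ordered tuple of faces.

```python
from itertools import product

faces = range(1, 7)  # the six faces of one die

one_die = set(faces)
two_dice = set(product(faces, repeat=2))  # ordered pairs (first die, second die)
six_dice = set(product(faces, repeat=6))  # ordered 6-tuples

# Subset b'): the outcomes in which all six dice show different faces.
all_different = {outcome for outcome in six_dice if len(set(outcome)) == 6}

print(len(one_die), len(two_dice), len(six_dice))  # 6 36 46656
print(len(all_different))                          # 720
```

The subset of outcomes with six different faces has $6! = 720$ elements, far fewer than the full sample space.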

The second method to determine a set is through a specified rule of membership. However, there are always people who quibble about the meaning of the words in such a rule, and that kind of quibbling is meaningless.

The notion of a subset has been explained above, but here is the formal definition:

If every point of $A$ belongs to $B$, then $A$ is contained or included in $B$ and is a subset of $B$, and $B$ is a superset of $A$.

We can write this relation in two ways:

Two sets are identical if they contain exactly the same points, and then we write

By the way, a roundabout method to check whether two sets are identical is to check both inclusions: $A=B$ if and only if $A\subset B$ and $A\supset B$. Although this way seems indirect, it is often the easiest way to show that two sets are identical.

This post is the first in our series on Probability Theory. We talked about some concepts and properties of sets and sample spaces, about the relationship between elements (also known as points) and a space (also known as a set), and about the relationship between one set and another. A subset is also a set; however, it holds a special position in probability theory. Without understanding the concept of a subset, we might get confused in later discussions.
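The two-inclusion check can be written out as a minimal sketch. The helper names `is_subset` and `are_identical` are made up for illustration; Python's built-in `a <= b` and `a == b` perform the same tests on sets.

```python
def is_subset(a: set, b: set) -> bool:
    """True if every point of a also belongs to b."""
    return all(point in b for point in a)


def are_identical(a: set, b: set) -> bool:
    """Two sets are identical when each one is contained in the other."""
    return is_subset(a, b) and is_subset(b, a)


A = {1, 2, 3}
B = {3, 2, 1, 1}  # the order and the repeated 1 do not matter in a set
C = {1, 2, 3, 4}

print(are_identical(A, B))                # True
print(is_subset(A, C), is_subset(C, A))   # True False
```

Only one inclusion holds between `A` and `C`, so they are not identical.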

^{1}:DeGroot M H, Schervish M J. Probability and statistics[M]. Pearson Education, 2012.

^{2}:Chung K L. Elementary probability theory with stochastic processes[M]. Springer Science & Business Media, 2012.

From now on, to make everything clear, we must assume that we know nothing about numbers: even $1,2,3,\dots$ are unknown until they are defined formally and precisely.

When we get something from others, no matter what it is, what concerns us most is always how many there are. This question is about quantity, so the figure below can be an answer to our question:

Three rows represent three different things:

□. The row I contains $\{\text{apple}\}$

○. The row II contains $\{\text{apple},\text{apple}\}$

△. The row III contains $\{\text{apple},\text{apple},\text{apple}\}$

To answer the question of how many there are, I have drawn three symbols on the right-hand side of the equations, each of which represents the quantity of things in its row. (There is a little bug here: we have not defined what equality means.) So:

□. The quantity of apples in row I is **rectangle**

○. The quantity of apples in row II is **circle**

△. The quantity of apples in row III is **triangle**

Aha, by now we have already defined some numbers, and they are $\{□,○,△\}$. Although we have only defined △ numbers (the quantity of $\{□,○,△\}$ is △), we can use this strategy to define as many numbers as we want. So it may already have dawned on you that *a number is just a symbol which gives a definite and unique answer to the question: how many things are there?* Even though we surely have the ability and the time to define enough symbols to answer any quantity question, doing so is monotonous and inefficient. This leads to a new idea: how about using just **a few symbols**, from which we can create infinitely many different combinations? This simple idea gives us sufficient tools and materials to build the concrete and elegant edifice of numbers. Then some great forefathers created $\{0,1,2,3,\dots\}$. I have to admit that these symbols are more convenient than my ‘gurgles’ (the name of a children’s show in ‘Good Luck Charlie’ whose heroes are a rectangle, a circle, and so on).

How many symbols we are going to use decides what the number system is. If we use just 2 symbols we get a binary number system, and if we use ten symbols we get a decimal number system.

Just as most humans only know decimal numbers, computers only know binary ones (or base-$2^n$ ones, such as the octal and hexadecimal systems) because of their hardware architecture. Binary numbers are expressed as

This may be weird for you if you are not a computer science student, but all our computations on computers, smartphones, etc. are based on binary. Each binary digit is called a bit.

Let’s look at some examples. The decimal number 4 can be expressed as $(100.)_2$ in base 2, and we can write this in the form:

Translating a binary number into a decimal one for us to read is relatively easier than the reverse:

The $2^{n}$ here must be calculated in the decimal system, where $2^2=4, 2^{10}=1024,\dots$

For example, convert $(10010)_2$ to the decimal number:
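The conversion can be sketched in Python as a sum of bits times powers of 2. The sketch assumes the usual positional expansion, in which bits to the left of the radix point carry powers $2^0, 2^1, \dots$ and bits to the right carry $2^{-1}, 2^{-2}, \dots$; the function name `binary_to_decimal` is made up for illustration.

```python
def binary_to_decimal(bits: str) -> float:
    """Convert a binary string such as '10010' or '0.101' to its decimal value."""
    if not bits or set(bits) - set("01."):
        raise ValueError("expected only the characters 0, 1 and '.'")
    integer_part, _, fractional_part = bits.partition(".")
    value = 0.0
    # Bits left of the radix point: the rightmost one has weight 2^0.
    for power, bit in enumerate(reversed(integer_part)):
        value += int(bit) * 2 ** power
    # Bits right of the radix point: the first one has weight 2^-1.
    for power, bit in enumerate(fractional_part, start=1):
        value += int(bit) * 2 ** (-power)
    return value


print(binary_to_decimal("10010"))  # 18.0 = 16 + 2
print(binary_to_decimal("100."))   # 4.0
```

Only finite binary strings are handled here; repeating expansions need a different approach.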

The algorithm we wish to discuss next is about how to convert decimal numbers to binary numbers.

We divide the decimal number into two parts, the integer part and the fractional part. For example,

We divide the integer part by $2$ successively until the quotient is $0$, recording the remainders, which will always be $0$ or $1$; for example, $5\div 2 =2 \dots 1$, where 2 is the quotient and 1 is the remainder. The recorded remainders are written down starting at the decimal point (‘radix point’ may be more accurate) and moving to the left.

Then $53.$ in base 10 is equal to $110101.$ in base 2. To check this result, we can easily use formula (1):
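The repeated division can be sketched as follows, using 53 as the example value; the function name `integer_to_binary` is made up for illustration.

```python
def integer_to_binary(n: int) -> str:
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient and remainder of n / 2
        remainders.append(str(remainder))
    # The first remainder is the bit next to the radix point, so reverse the list.
    return "".join(reversed(remainders))


print(integer_to_binary(53))  # 110101
```

The remainders come out as 1, 0, 1, 0, 1, 1, and reading them from last to first gives the binary digits from left to right.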

Convert $(0.7)_{10}$ to binary by reversing the preceding steps: multiply by 2 successively and record the integer parts, then remove the integer part and continue with the remaining fractional part:

We can see that the steps from $0.4\times 2$ to $0.2\times 2$ will repeat over and over, so the binary digits repeat infinitely. So we write it as:
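The multiply-by-2 procedure, including the detection of the repeating block, can be sketched with exact fractions (floating point would blur the cycle). The function name `fraction_to_binary` and its return convention are assumptions made for this illustration.

```python
from fractions import Fraction


def fraction_to_binary(x: Fraction) -> tuple[str, str]:
    """Return (non_repeating_bits, repeating_bits) for a fraction with 0 <= x < 1."""
    if not 0 <= x < 1:
        raise ValueError("x must satisfy 0 <= x < 1")
    seen = {}  # fractional part -> position at which it first appeared
    bits = []
    while x != 0 and x not in seen:
        seen[x] = len(bits)
        x *= 2
        bit = x.numerator // x.denominator  # the integer part is the next bit
        bits.append(str(bit))
        x -= bit                            # keep only the fractional part
    if x == 0:
        return "".join(bits), ""            # the expansion terminates
    start = seen[x]                         # the cycle begins where x was first seen
    return "".join(bits[:start]), "".join(bits[start:])


print(fraction_to_binary(Fraction(7, 10)))  # ('1', '0110')
```

For $0.7$ the sketch reports the non-repeating bit `1` followed by the repeating block `0110`, that is, $(0.1\overline{0110})_2$.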

From this we conclude that

Formula (1) has told us how to convert a binary number into a number in base 10; now we use a few little tricks to handle the fractional part more concisely.

There is no difficulty in this process for finite expansions, but how should the infinitely repeating ones be calculated? Suppose $x=(0.\overline{1011})_2$; let’s convert it to decimal:

Another, more complicated example: what is $x=(0.10\overline{101})_2$ in decimal form:

first

for

then we set:

Using the same method as in the last example, we get:

Then we can get $z$ from the third formula:

Then we can get $x$ from the first formula:
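Both repeating examples can be checked with a short sketch based on the shifting trick: if $z=(0.\overline{b_1\dots b_k})_2$, then $2^k z - z = (b_1\dots b_k)_2$, so $z = (b_1\dots b_k)_2/(2^k-1)$. The function name `repeating_binary_to_fraction` and its argument convention are made up for illustration.

```python
from fractions import Fraction


def repeating_binary_to_fraction(prefix: str, repeat: str) -> Fraction:
    """Exact value of (0.prefix repeat repeat ...)_2 as a fraction."""
    if not repeat:
        raise ValueError("repeat must contain at least one bit")
    shift = 2 ** len(prefix)
    # Finite part: the prefix bits read as an integer, moved behind the radix point.
    head = Fraction(int(prefix, 2), shift) if prefix else Fraction(0)
    # Repeating part z = (0.repeat repeat ...)_2 satisfies (2^k - 1) z = (repeat)_2.
    z = Fraction(int(repeat, 2), 2 ** len(repeat) - 1)
    # The repeating part starts after the prefix, so divide it by 2^len(prefix).
    return head + z / shift


print(repeating_binary_to_fraction("", "1011"))   # 11/15
print(repeating_binary_to_fraction("10", "101"))  # 19/28
```

As a cross-check, `repeating_binary_to_fraction("1", "0110")` gives `7/10`, which matches the expansion of $0.7$ found earlier.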

In this post we have learned something about binary numbers; the central topic was how to convert between decimal and binary.

- Sauer, T. (n.d.). Numerical Analysis. Pearson.

their curves are like these:

- $g(x)=0.987862x-0.155271x^3+0.00564312x^5$

- $f(x)=\sin(x)$

- $f(x)=\sin(x)$ and $g(x)=0.987862x-0.155271x^3+0.00564312x^5$

The figures above show how a polynomial can approximate a more sophisticated function. However, how to obtain this polynomial is not our concern here; what we care about is how to compute formula (1) more efficiently.

We now change our polynomial (1) to a general one which contains terms of all integer exponents up to its degree:

What we are going to do is find the best way to evaluate polynomial (2) at $x=\frac{1}{2}$. We assume that the coefficients of the polynomial and the number $\frac{1}{2}$ are always stored in memory or even in registers; that is, we do not take data transfer time into account.

However, how to measure the efficiency of an evaluation is a new problem we have to face. I have two ways to solve it:

The first is to count the clock ticks of the entire process, that is, to measure time. For instance, if our algorithm takes 10 seconds and the state-of-the-art algorithm takes 11 seconds, our algorithm is the best for now.

The second is to count the total number of operations in the algorithm. For example, if our algorithm uses 10 multiplications and 5 additions, while the state-of-the-art algorithm uses 11 multiplications and 5 additions, our algorithm is better.

In this section, we will measure the efficiency of an algorithm in the second way, by counting the number of operations.

Now, we go back to our main problem of finding out the best way to evaluate:

In the usual way, we would calculate

There are $4+3+2+1=10$ multiplications and $1+1+1+1=4$ additions (the subtraction here is regarded as the same as an addition).

Method one has 10 multiplications and 4 additions.

Attentive readers may have noticed that we computed ‘$\frac{1}{2}\times \frac{1}{2}$’ more times than necessary. Some results can be stored in memory instead of being computed again and again, which saves a lot of computation. So the method becomes:

That is to say, each of the powers $(\frac{1}{2})^2$, $(\frac{1}{2})^3$ and $(\frac{1}{2})^4$ now costs only one multiplication, because it reuses the previous power. Together with one multiplication by the coefficient, the terms of degree 4, 3 and 2 cost two multiplications each and the linear term costs one. So there are $2+2+2+1=7$ multiplications and 4 additions in total (the subtraction here is regarded as the same as an addition).

Method two has 7 multiplications and 4 additions.

Rewrite the polynomial as:

When $x=\frac{1}{2}$, we evaluate the polynomial from the inside out:

$\frac{1}{2}\times 2$ , add $+3\to 4$

$\frac{1}{2}\times 4$ , add $-3\to -1$

$\frac{1}{2}\times (-1)$ , add $+5\to \frac{9}{2}$

$\frac{1}{2}\times \frac{9}{2}$ , add $-1\to \frac{5}{4}$

This method is called **nested multiplication** or **Horner's method**. It needs only 4 multiplications and 4 additions.
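The three ways of evaluating the polynomial can be compared with a short sketch that counts operations. It assumes the polynomial is $2x^4+3x^3-3x^2+5x-1$, which is inferred from the nested evaluation steps above; the function names are made up for illustration.

```python
from fractions import Fraction

# Coefficients of 2x^4 + 3x^3 - 3x^2 + 5x - 1, constant term first.
# Assumption: this polynomial is inferred from the nested steps in the text.
COEFFS = [-1, 5, -3, 3, 2]


def evaluate_naive(coeffs, x):
    """Method one: recompute every power of x from scratch."""
    total, mults, adds = coeffs[0], 0, 0
    for degree in range(1, len(coeffs)):
        term = coeffs[degree]
        for _ in range(degree):      # degree multiplications for this term
            term *= x
            mults += 1
        total += term
        adds += 1
    return total, mults, adds


def evaluate_stored_powers(coeffs, x):
    """Method two: keep the previous power of x and reuse it."""
    total, mults, adds = coeffs[0], 0, 0
    power = x
    for degree in range(1, len(coeffs)):
        if degree > 1:
            power *= x               # one multiplication for the next power
            mults += 1
        total += coeffs[degree] * power
        mults += 1                   # one multiplication by the coefficient
        adds += 1
    return total, mults, adds


def evaluate_nested(coeffs, x):
    """Method three: nested multiplication, from the inside out."""
    total, mults, adds = coeffs[-1], 0, 0
    for c in reversed(coeffs[:-1]):
        total = total * x + c        # one multiplication and one addition
        mults += 1
        adds += 1
    return total, mults, adds


half = Fraction(1, 2)
print(evaluate_naive(COEFFS, half))          # (Fraction(5, 4), 10, 4)
print(evaluate_stored_powers(COEFFS, half))  # (Fraction(5, 4), 7, 4)
print(evaluate_nested(COEFFS, half))         # (Fraction(5, 4), 4, 4)
```

All three methods return $\frac{5}{4}$; only the number of multiplications differs.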

The example of polynomial evaluation is characteristic of the entire topic of computational methods for scientific computing. The standard form of a polynomial, $c_1+c_2x+c_3x^2+c_4x^3+c_5x^4$, can be written in nested form as:

In chapter 3 we will require the form:

where we call $r_1,r_2,r_3$ and $r_4$ the **base points**. When we set $r_1=r_2=r_3=r_4=0$, formula (0.8) reduces to formula (0.7).
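Nested multiplication with base points can be sketched as below. The sketch assumes the form $c_1+(x-r_1)(c_2+(x-r_2)(c_3+(x-r_3)(c_4+(x-r_4)c_5)))$, which is consistent with the statement that zero base points give back the plain nested form; the function name `nest` and its signature are illustrative choices.

```python
def nest(coeffs, x, base_points=None):
    """Evaluate c1 + (x - r1)(c2 + (x - r2)(c3 + ...)) by nested multiplication.

    coeffs      -- [c1, c2, ..., c_{d+1}], constant term first
    base_points -- [r1, ..., r_d]; defaults to all zeros (plain nested form)
    """
    degree = len(coeffs) - 1
    if base_points is None:
        base_points = [0] * degree
    if len(base_points) != degree:
        raise ValueError("need exactly one base point per degree")
    result = coeffs[-1]
    # Work from the innermost bracket outwards.
    for c, r in zip(reversed(coeffs[:-1]), reversed(base_points)):
        result = result * (x - r) + c
    return result


# Zero base points: the polynomial 2x^4 + 3x^3 - 3x^2 + 5x - 1 at x = 1/2.
print(nest([-1, 5, -3, 3, 2], 0.5))  # 1.25

# Non-zero base points: 1 + (x - 1)(2 + (x - 2) * 3) at x = 3.
print(nest([1, 2, 3], 3, [1, 2]))    # 11
```

With all base points set to zero, each step reduces to `result * x + c`, which is the nested multiplication used above.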

A polynomial can be evaluated very efficiently through nested multiplication. But what is more important for us in this subject are these points:

Computers are very fast at doing very simple things.

It’s important to do even simple tasks as efficiently as possible.

The best way may not be the obvious way.

- Sauer, T. (n.d.). Numerical Analysis. Pearson.