Here’s a cute proof of the well-known puzzle involving 2 water jugs. The proof is worded very tersely, but it’s still pretty cool.

http://www.futilitycloset.com/2013/02/14/well-done/

Reply

Here’s a cute proof of the well-known puzzle involving 2 water jugs. The proof is worded very tersely, but it’s still pretty cool.

http://www.futilitycloset.com/2013/02/14/well-done/

Advertisements

Why Nate Silver Can Save Math Education in America

I heard somewhere that Nate Silver is thinking of analyzing teacher assessment at some point. It’s a big problem, there’s no definitive answer, and there’s lots of data (though since most of it is standardized test scores, I find much of it questionable). Sounds like a place he can have fun.

That’s not what this article is about.

This article is about having math superstars being rolemodels for kids, to answer their questions of, “when am I ever going to use this?” I feel this is treating the symptoms rather than the disease. If math was taught as a way to solve problems and explore the world around us — as opposed to drilling computations — students’ curiosity will be a natural source of motivation. “I want to understand” is a better driver than “I want to be a celebrity.”

The article’s secondary point, that the current curriculum sorely lacking in some departments (like statistics and finance), is something I agree with.

I ran across a neat little trick today for factoring quadratics.

Let’s start simple: factor x^2 + 11x + 24. To this, most people would start by finding factors of 24 because we need two numbers that multiply to +24 while adding to +11. Well 1,24 would sum too high and 4,6 would sum just a little too low. Nearby, 3,8 works out. x^2 + 11x + 24 = (x + 3)(x + 8).

How about factoring 6x^2 – 11x + 10? This is a bit more annoying because now we have to factor both 6 and 10, check pairwise combinations, and try both permutations of which factor of 6 multiplies which factor of 10. There are a lot more combinations to check.

However let’s transform this problem into a similar problem: factor 6(6x^2 – 11x – 10) = (6x)^2 + 11(6x) – 60 = y^2 + 11y – 60, where y = 6x. This factors to (y – 4)(y + 15) or (6x – 4)(6x + 15). However, there is an extra factor of 6 in there, which we remove to get (3x – 2)(2x + 5).

In summary, we turn an annoying problem we’d solve by trial and error to one that can be solved by less trial and error. Okay, we’re replacing factoring two small numbers with factoring one large number, but we have a heuristic to guide the latter in that the sum/difference of the two factors behaves in a simple manner.

tl;dr taking the time to turn a hard problem into an easier problem may save you time in the long run.

(This is a post aimed more at high school students or young college students.)

This is a neat mental math trick I learned from television. Let’s start easy and have one of your friends find a 4-digit perfect square (i.e. square any integer between 32 and 99, inclusive). This trick will allow you to find the square root almost immediately. For example, let’s use 3136. Call it S.

- Estimate the square root to the nearest ten. 50 squared is 2500 and 60 squared is 3600, so the square root is between 50 and 60. I’m going to pick 50 because it’s easier to divide by. Let X = 50.
- Divide S by X, ignoring the remainder. 3136 / 50 = 62-point-something. Let W = 62.
- Average X and W. That is the square root! So for our example, the average of 50 and 62 is 56. You can check that this is right.

However, your friends who are on the ball might not be that impressed since, because it is known that 3136 is a perfect square, we would have guessed that the units digits is either a 4 or a 6 since that is the only way to get a 6 in the units’ digit of 3136. So let’s up the ante: we will now compute the square root of any 4-digit number to one decimal point. For this, I asked a random number generator and got 5389.

- 70 squared is 4900, 80 squared is 6400, so we’re somewhere between there. Let X = 70, since 4900 is closer, and like most iterative algorithms, Newton’s Method works better when we start closer to the endpoint.
- S / X is almost exactly 77. Let W = 77.
- The average of X and W is 73.5. Because this method tends to overestimate, let’s call this 73.

Now let us repeat steps 2 and 3, but with X equal to the number found in step 3.

- (again): S / X is a bit trickier with X = 73. I have a mental math trick for this (which I’ll explain later): the answer is 73 + (5389 – 4900 – 420 – 9) / 73 = 73 + (489 – 420 – 9) / 73 = 73 + (69 – 9) / 73 = 73 + 60 / 73, or about 73.8.
- (again): The average of 73 and 73.8 is 73.4. To check this, the square root of 5389 is 73.4098086…

Before I explain how it works, I find the student learns more when they try it for themselves. So here are some questions:

- Show how this algorithm is derived from Newton’s method. Hint: use y(x) = x^2 – S. Why does this algorithm tend to overestimate?
- Explain the mental math trick used in step 2 (again). Hint: use algebra.

WARNING: Try the problems posted in part 1 before reading this post. It took me some minutes, but mostly because I had to rederive Newton’s Method and tried the wrong y(x) first. I find that students do not really learn when answers are just handed to them; you have do it for skills to sink in. Do you think basketball stars got that good by listening or reading about how to play? They go onto the court and practice their skills. Math is the same.

1) I have a confession: I have a really bad memory for formulas. However, this is not a problem in math if you know what you’re doing. This is the formula for Newton’s Method that is easiest for me to remember:

y(x_i) = (x_i – x_{i+1}) [dy/dx evaluated at x_i]

where x_i is the current guess (i.e. the X in Step 1) and x_{i+1} is the next guess (i.e. the number found in Step 3). This is easiest to see if I could include a diagram in this blog. Since I can’t, I’ll just have to describe it to you: imagine the diagram of Newton’s Method. We have a right triangle formed by the points (x,y) = (x_i,0), (x_i, y(x_i)), and the point where the tangent line meets the x-axis. The right angle is at (x_i,0). The slope of the tangent line is y(x_i) / (x_i – x_{i+1}), but calculus also tells us that the slope is dy/dx evaluated at x_i, hence the equation above.

dy/dx evaluated at x_i is 2x_i. Solving for x_{i+1} (I’ll spare you the algebra) gives

x_{i+1} = x_i / 2 + S / 2x_i

which is the average of x_i and S / x_i. Now why does this algorithm always overestimate? Draw y(x) = x^2 – S. With positive square roots, we’re always on the right half of the parabola. Because y(x) is concave up, the next iteration of Newton’s method is always a little higher than the actual root no matter if x_i is too high or too low. This is why we always round down.

2) If S = 5389 and X is 73, how do estimate S / X? Well think about it this way: we’re looking for some number d such that (70 + 3)(70 + 3 + d) = 5389. Expand these terms: 70^2 + 2*70*3 + 3^2 + 73d = 5389. Therefore d = (5389 – 4900 – 420 – 9) / 73. Easy peasy lemon squeezy.

If I may make a commentary, the ability to do square roots in your head will not get you a job. There’s a reason why calculators were invented. However, to many people, the fun of a magic show is trying to figure out how the magician did it. And now you know why this trick works. It’s not magic, it’s math, which is way more powerful. This is a good way of understanding Newton’s Method, which is a powerful tool which is still used today.

Q1. You’re playing Stone Age. You place 2 workers on gold, 2 on brick, and 1 on food. What is the expected amount of each resource you’ll collect?

A1. Let G, B, and F be, respectively, the number of resources we get. Let X, Y, and Z be the rolls.

- P(G=0) = P(X in {2,3,4,5}) = 10/36
- P(G=1) = P(X in {6,7,8,9,10,11}) = 25/36
- P(G=2) = P(X=12) = 1/36
- (AN: Sorry, can’t use summation notation in wordpress) E[G] = 0 P(G=0) + 1 P(G=1) + 2 P(G = 2) = 0(10/36) + 1(25/36) + 2(1/36) = 27/36 = 0.75 gold

Similarly, we find E[B] = 49/36 = 1.36 brick and E[F] = 9/6 = 1.5 food. This a simple exercise in **expectation **for discrete random variables.

Q2. You have the same situation as before, but now you have a single tool. Assume you roll for resources in order: gold, brick, food. Also assume you use the tool if it will allow you to gain an additional resource but otherwise not.

A2. Let event TG be true if we have the tool before rolling for gold, and similarly for TB and TF. (One of the problems with probability notation is using capital letter for both events and random variables. Have to think of an alternative. Maybe Roman and Greek letters?) Furthermore, let TGc be the complement of event TG, and similarly for the others. We have already calculated:

- E[G|TGc] = 3/4
- E[B|TBc] = 49/36
- E[F|TFc] = 3/2

We should recalculate the expectations given the tool (the calculations are very similar to the above, so they’re omitted):

- E[G|TG] = 33/36 = 0.92
- E[B|TB] = 59/36 = 1.63
- E[F|TF] = 2

Also, we should calculate the probability that we retain the tool after rolling for each resource:

- P(TB) = P(X not in {5,11}) = 30/36
- P(TF) = P(TF|TB) P(TB) + P(TF|TBc) P(TBc) = P(Y not in {3,7,11}) P(TB) + 0 P(TBc) = (26/36)(30/36) + 0 = 65/108

That last line comes from the **law of total probability** in **conditional probability** since {TB,TBc} is a partition of the event space. Finally, we can use the analog of the law of total probability for **condition expectation**:

- E[B] = E[B|TB] P(TB) + E[B|TBc] P(TBc) = (59/36)(5/6) + (49/36)(1/6) = 344/216 = 1.59 brick
- E[F] = E[F|TF] P(TF) + E[F|TFc] P(TFc) = (2)(65/108) + (3/2)(43/108) = 389/216 = 1.80 food

In summary, by adding 1 tool, a player in this situation can expect 0.17 extra gold, 0.23 brick, 0.30 food, which I estimate to be worth 2.6 pips that turn, and you get to use the tool every turn. This is why tools are a good investment.

Q3. Instead of having 1 tool, say you now have 5 and then you roll a 7 for gold. Do you use all 5 tools to get an extra gold, or do you save the tools for the other rolls? Because it possible to get as many as 8 pips (1-2 tools for one brick, 3-4 tools for two food). This is a trickier problem which I’ll save for Part 2.

Here’s an example of using conditional probability to solve for a result in Can’t Stop.

Q. What is the probability of rolling at least one 7?

A. Let A, B, C, and D represent the four dice rolls. Because of symmetry, it doesn’t matter what A is.

There are 3 cases to consider for B:

- B = 7 – A, a 1-in-6 probability, in which we have failed to not roll a 7.
- B = A, a 1-in-6 probability.
- Otherwise, a 4-in-6 probability.

Therefore, after 2 dice, we have 3 cases:

- 7 has been rolled: probability 1/6
- Only 1 unique number has been rolled: probability 1/6
- 2 unique numbers have been rolled (that don’t add up to 7, which I’ll leave implied): probability 2/6

Let’s add a third die:

- P(7 has been rolled after 3 dice) = P(7 has been rolled after 3 dice | 7 has been rolled after 2 dice) P(7 has been rolled after 2 dice) + P(7 has been rolled after 3 dice | 1 unique number after 2 dice) P(1 unique number after 2 dice) + P(7 has been rolled after 3 dice | 2 unique numbers after 2 dice) P(2 unique numbers after 2 dice) = (1)(1/6) + (1/6)(1/6) + (2/6)(4/6) = 15/36
- P(1 unique number after 3 dice) = P(1 unique number after 3 dice | 1 unique number after 2 dice) = (1/6)(1/6) = 1/36
- P(2 unique numbers after 3 dice) = P(2 unique numbers after 3 dice | 1 unique number after 2 dice) P(1 unique number after 2 dice) + P(2 unique numbers after 3 dice | 2 unique numbers after 2 dice) P(2 unique numbers after 2 dice) = (4/6)(1/6) + (2/6)(4/6) = 12/36
- P(3 unique numbers after 3 dice) = P(3 unique numbers after 3 dice | 2 unique numbers after 2 dice) P(2 unique numbers after 2 dice) = (2/6)(4/6) = 8/36

Sanity check: these add up to 1. Finally, the 4th die:

- P(7 has been rolled after 4 dice) = P(7 has been rolled after 4 dice | 7 has been rolled after 3 dice) P(7 has been rolled after 3 dice) + P(7 has been rolled after 4 dice | 1 unique number after 4 dice) P(1 unique number after 4 dice) + P(7 has been rolled after 4 dice | 2 unique numbers after 3 dice) P(2 unique numbers after 3 dice) + P(7 has been rolled after 4 dice | 3 unique numbers after 3 dice) P(3 unique numbers after 3 dice) = (1)(15/36) + (1/6)(1/36) + (2/6)(12/36) + (3/6)(8/36) = 139/216 = 59.7%

Of course, all this should be rewritten in the usual mathematical notation so it’s a bit easier to read.

However, I’m having trouble coming up with an elegant solution to what is the likelihood of busting if your columns are 6,7,8 or 5,7,8, etc. I think it becomes a rather tedious counting problem, not much better than just having a computer exhaustively check every possibility. Even the probability of rolling any other number (like 8 ) becomes complicated once we lose the symmetry. That may be an important lesson in itself.

Lately, I’ve been thinking about how I would teach various classes I may be called to teach. Frankly, I’m uncertain how to design a project-based course on, say, linear algebra or calculus. But for probability and statistics, I have some ideas: games. There are 2 reasons I want to use games: 1) Games are fun, which I hope will engage students. I want them to want to learn more, to explore a problem and try to solve it for themselves, something I find lacking in many math students. 2) I like games and can talk about them all day. I hope that will give my lessons that little extra fizz since I’m not the most charismatic teacher.

One excellent game that comes to mind is Rallyman. I don’t have time to summarize the rules, but it is an excellent mix of risk management and planning. Here are some ways the game can be used to illustrate various concepts:

**Intro.** If people don’t know what rally driving is, show them a video of Ken Block or something. Will this turn off people who aren’t interested in cars? Maybe. Have to work on that.

**Probability and events.** Start with a time attack roll, when all dice are rolled together. What is the probability that 3 hazard symbols come up and you lose control? Well, that depends on the mix of dice being rolled. Start with simple problems of all dice having the same probability of hazards (1-in-6 or 2-in-6), then move onto mixtures. This can lead to discussion of counting (how many ways are there to roll 2 hazards with 4 dice?) and/or making use of the probabilities of the events they just calculated.

Another concept that’s very important here is independent events: the result of one dice doesn’t affect the other dice. Furthermore, say you’re rolling 1 gas and 3 gear dice. We can create events A, one hazard is rolled on the gas die; B, 2 or more hazards are rolled on the gear dice; C, 3 hazards are rolled on the gear dice. Event A is independent of B and C, which makes it easier to calculate probabilities of intersections of events.

**Conditional probability. **I’m pretty sure there’s an example of conditional probability, but a lot of the simple events (i.e. die rolls) are independent, and the only example I can think of (how one turn impacts the ones after it) is probably better done after discussing expectation.

**Random variables and expectation. **The bottom line in Rallyman are seconds spent, which is an example of a random variable: if we make this move, we have an 80% chance of using only 10 seconds but a 20% chance of using a whole minute. From this, we can talk about expectation: on average, how many seconds will this take? (Of course, maybe taking only 10 seconds will win you the race while 60 seconds will lose it, so depending on the context, asking about expected number of seconds is the wrong question, but that will have to wait until we talk about cost functions and decision theory.) From there, we can extend this to conditional expectations: if you spin out, that screws up your next turn, which affects the number of seconds that will take.

**Big picture. **I don’t expect Rallyman to be solved by any student without a computer. I think if Rallyman were used in a classroom, its role would be to illustrate various concepts and to show sample calculations. For evaluation, students would instead use these ideas to study something else.