Power Steering Fluid

I was replacing the power steering fluid in my car when I stumbled upon some exponential decay math.  First, some background:  There is no drain plug on a power steering system.  You need to siphon out fluid from the reservoir and replace it with new fluid.  This new fluid then mixes into the rest of the system to create a slightly cleaner mixture.  The idea is that if you repeat this a few times, you’ll replace most of the old fluid with new fluid.

So, as you can see, there exists some sort of formula that can determine exactly how many times you need to extract and replace to reach X% replacement.

My car’s entire power steering system holds about 1 liter, and the reservoir itself holds .4 liters of that. So, 40% of the steering fluid is replaced each time I drain and refill the reservoir, and 60% of the old fluid remains elsewhere in the system.  I can do this repeatedly, each time replacing 40% of the “mixed” fluid with brand new fluid.

Let p = Percentage of the system replaced each time you empty the reservoir.
Let n = number of times you empty/fill the reservoir (“flush”)
Let FD = Percentage of dirty fluid in the system.
Let FN = Percentage of new fluid in the system.

If p = percentage of new fluid introduced by a flush, then (1-p) is percentage of old fluid remaining.  (eg: if p = .40, then (1-.40) = .60)

FD_0 = 1  (initial proportion that is dirty)
FD_1 = (1-p)
FD_2 = (FD_1)(1-p) = (1-p)(1-p)
FD_3 = (FD_2)(1-p) = (1-p)(1-p)(1-p)

FD_n = (FD_{n-1})(1-p) = (1-p)^n

 

FD = (1-p)^n
FN = 1 - FD

 

For my car, p=.40 so FD = (1-.40)^n

How many times do I need to empty and fill the reservoir to get to 80% clean?
Just set FN = .8 and solve for n:

.80 = 1- (1-.40)^n
.80 = 1- (.60)^n
.6^n = .2
log(.6^n) = log(.2)
n*log(.6) = log(.2)
n = \frac{log(.2)}{log(.6)}
n = 3.15

 

With a reservoir that holds 40% of capacity, I need to empty and replace it about 3 times to get 80% of the old fluid replaced.

You can also use the formula to figure out what percentage of the system contains old vs. new fluid, based on the number of flushes you’ve done.  For this, you just plug in n and calculate FD.  eg:  If you’ve done 5 flushes, FD = (1-.4)^5 = .08   So, after 5 flushes, 8% is dirty, and 92% is new.
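
If you'd rather not do the logs by hand, here is a minimal Python sketch of the same two calculations.  The function names are my own, and p = .40 is just my car's number, so treat it as an illustration rather than a general tool.

```python
import math

def dirty_fraction(p, n):
    """Fraction of old fluid left after n flushes, each replacing a fraction p."""
    return (1 - p) ** n

def flushes_needed(p, target_new):
    """Return (exact n, whole flushes) needed so that target_new of the fluid is new."""
    exact = math.log(1 - target_new) / math.log(1 - p)
    return exact, math.ceil(exact)

p = 0.40
exact, whole = flushes_needed(p, 0.80)
print(f"exact n = {exact:.2f}, whole flushes to be sure = {whole}")

# Dirty vs. new fluid after each flush
for n in range(1, 8):
    fd = dirty_fraction(p, n)
    print(f"n={n}: dirty={fd:.3f}, new={1 - fd:.3f}")
```

Since you can't do a fraction of a flush, the table is handy for seeing what 3 flushes vs. 4 flushes actually buys you.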

Using Trigonometry to Estimate Influenza Deaths



In a nutshell, the sin() and cos() terms are periodic curves, and the weighting of the various coefficients is what allows proper regression fits.  Let’s take a closer look at the formula, and try to make sense of it.

As t (weeks) increases to 52, \frac{t}{52} goes from 0 to 1  (\frac{0}{52}, \frac{1}{52}, \frac{2}{52}, \frac{3}{52}, …, \frac{52}{52}).  Once t goes past 52, the cycle just starts over.  Recall 2\pi radians = 360 degrees.  Since \frac{t}{52} is multiplied by 2\pi, it is taking some fraction of 360 degrees.  So, the sin() and cos() terms simply use t weeks to sweep through one full 360 degree cycle per year.

For example, as t goes from 0 to 52, \frac{t}{52} goes from 0 to 1, and 2\pi * \frac{t}{52} goes from 0 to 360 degrees.  Therefore sin(2\pi * \frac{t}{52}) goes from sin(0) to sin(360), which is one full periodic cycle of this function.  After that it repeats, since sin repeats in multiples of 2\pi.  Note the same logic applies to the cos() term in the formula.
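
To see the weekly cycle in numbers, here is a small Python sketch.  It only evaluates the two periodic terms; it does not include any regression coefficients, and the function name is my own.

```python
import math

def seasonal_terms(t, period=52):
    """Return (angle in degrees, sin term, cos term) for week t."""
    angle = 2 * math.pi * t / period
    return math.degrees(angle), math.sin(angle), math.cos(angle)

# One value per quarter of the year, plus one week past the end of the cycle
for t in (0, 13, 26, 39, 52, 65):
    degrees, s, c = seasonal_terms(t)
    print(f"t={t:2d}  angle={degrees:5.0f} deg  sin={s:+.3f}  cos={c:+.3f}")
```

Week 65 lands on the same sin and cos values as week 13, which is the "cycles around again" behavior described above.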

The picture says it all.

Further Reading:  Automated Detection of Influenza Epidemics with Hidden Markov Models

Fractals: Recursion & Iteration with Complex Numbers

 

The essence of fractal geometry lies in recursive iteration.  What’s that?  It’s just a self-referring loop. Let’s start with a simple equation:  f(x)=2x+1

Let x=0 and plug it in, and you’ll get a 1:
f(0)=2(0)+1=1
(Now, take that 1 and plug it back into the same equation)
f(1)=2(1)+1=3        (Then, take this 3 and do the same thing)
f(3)=2(3)+1=7
f(7)=2(7)+1=15
f(15)=2(15)+1=31

And so forth.  You can keep doing this forever, and notice how this list of results (0,1,3,7,15,31,…) will tend towards infinity.  (This isn’t always the case.)

The mother of all fractals, the Mandelbrot Set is defined by this deceptively simple equation:  f(z)=z^2+c where c is some fixed constant.  If you do the same procedure above, you’ll get a series of numbers.  eg: Let c=5, and let’s start with z=0:

f(0)=0^2+5=5
f(5)=5^2+5=30
f(30)=30^2+5=905

 
At IBM, Benoit Mandelbrot used a complex number (a+bi) for that constant.  For example, let’s use c=1+i, as he did in his 1980 Scientific American article introducing fractals:

f(0)=0^2+(1+i)=1+i
 
f(1+i)=(1+i)^2+(1+i)=(1+i)(1+i)+(1+i)=(1+2i+i^2)+(1+i)=2i+(1+i)=1+3i
 
f(1+3i)=(1+3i)^2+(1+i)=(1+3i)(1+3i)+(1+i)=(1+6i+9i^2)+(1+i)=6i-8+(1+i)=-7+7i
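
The hand calculations above are easy to reproduce in Python, which has complex numbers built in (it writes i as j).  This is just a sketch: the function names are mine, and the escape test uses the standard rule of thumb that once |z| exceeds 2 the values run off to infinity.

```python
def iterate(c, steps, z=0):
    """Return [z0, z1, ..., z_steps] from repeating z -> z*z + c."""
    orbit = [z]
    for _ in range(steps):
        z = z * z + c
        orbit.append(z)
    return orbit

def escapes(c, max_steps=100, bound=2.0):
    """True if the orbit of 0 under z -> z*z + c exceeds the bound within max_steps."""
    z = 0
    for _ in range(max_steps):
        z = z * z + c
        if abs(z) > bound:
            return True
    return False

print(iterate(5, 3))        # the real-number example with c=5
print(iterate(1 + 1j, 3))   # the complex example with c=1+i
print(escapes(1 + 1j), escapes(-1))
```

For c = -1 the values just bounce between -1 and 0 forever, so that point never escapes, while c = 1+i is already past the bound after two steps.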

 
…and so on. For the Mandelbrot set, the calculation is iterated until it’s clear whether the result stays bounded or tends towards infinity. Based on this result, for every complex number, you plot a point on the complex plane either black or white.  (Or, it is colored based on how fast it tends towards infinity.)  For example, 1+i tends towards infinity when plugged into this equation, so a black point is plotted for 1+i on the complex plane. This process is then repeated for every complex number in the picture (eg: .234234234 + .324325423i gets its own point), which was only possible because of the advent of modern computers. The result is the deeply infinite, self-referential image you see above.  The image is much more complex than it appears.  For example, if you zoom in to a certain section, you will see the entire image repeat within itself, and then repeat within that zoom!  I can’t do it justice here, so if you want to learn how this Math models real life phenomena & situations, these 2 videos are a great primer for a layman:

Pullups and Logarithmic Decay

So, I was talking to a friend about how pullups need more rest time between sets than other exercises:

Pullups need significant recovery time to not have logarithmic decay in the number of reps you can do.

I suggested waiting about 3 minutes between sets.  Of course, the next time I did them, I wanted to see just how logarithmic the decay is.  I plugged the numbers into Excel and did a log regression.  Wow, I wasn’t kidding, look at that correlation coefficient !!

What does pi (3.14159) have to do with taking great photographs?

The lens in this photo has a focal length of 50mm.  In general, the higher the focal length, the more zoom you’ll have.  For example, a wide angle lens is between 9mm-24mm, whereas a zoom/telephoto lens can be 150mm-400mm.  However, let’s focus on the list of numbers on the bottom edge of the lens:  1.4, 2, 2.8, 4, 5.6, 8, 11, 16.   At first glance, this is a bizarre sequence of random numbers.  These numbers allow you to choose the f/stop, which is the ratio of the focal length of the lens to the diameter of the lens opening (aperture).  For example, for an f-stop of 2 (written f/2.0) the diameter would be 25mm while the focal length is 50mm, because 25mm divides into 50mm two times.  Hence, the general equation is:  f/stop = \frac{focal\ length}{diameter}  Note that when the f/stop is low, it means the aperture is large.  (FYI, the point of a large aperture is to let in a ton of light to improve picture quality).

http://www.gettingfocus.com/


Next, let’s play around with some numbers and see where this leads us.  The simplest example would be to consider an f/stop ratio of 1 (written f/1.0) on a 50mm lens:  This would mean the diameter of the lens opening (aperture) is 50mm, making the radius equal to 25mm.  Remember the old circle formulas from middle school?  Well, knowing the radius, we can calculate the actual area of the lens opening at f/1.0.

http://commons.wikimedia.org/wiki/File:CIRCLE_1.svg

Area = \pi r^2

 

Circumference = 2 \pi r = \pi d

 

A = \pi r^2 = \pi (25)^2 = 1963.5

 

 

Ok, so let’s try an f/stop that is actually on the lens (f/1.4):

1.4 = \frac{50}{diameter} … (so d = 35.7 and r = 17.86)

A = \pi r^2 = \pi (17.86)^2 = 1001.8

An f/1.4 lens has an aperture area of about 1000.  Do you see any relationship between the area for f/1.0 vs. f/1.4?  If not, I plugged in the rest of the f/stop numbers printed on the lens into a spreadsheet.  What do you notice about the area of the circle for each subsequent f/stop?

In fact, the f/stop numbers that initially seemed so random actually do have a very precise relationship to each other.  The area of the circle is being approximately halved for each of these f/stops.  Conversely, for each f/stop you drop down, you are doubling the area of the lens aperture (effectively doubling the amount of light that the lens will let in!)  Compare f/16 to f/1.4.  They are 7 stops apart, meaning the amount of light doubles seven times.  That means the lens at f/1.4 allows 2^7 = 128 times as much light as it does at f/16!!  That makes a huge impact on the kind of pictures you can take when lighting is not optimal (indoors, night, etc).  Most pocket cameras are about f/4.  Even an f/2 lens will allow 4x as much light in (2 stops down).
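
Here is a rough Python version of that spreadsheet, in case you want to check the halving yourself.  It assumes a 50mm lens and the usual full-stop markings (I added f/1.0 as a reference point); the names are my own.

```python
import math

FOCAL_LENGTH_MM = 50
F_STOPS = [1.0, 1.4, 2.0, 2.8, 4.0, 5.6, 8.0, 11.0, 16.0]

def aperture_area(focal_length_mm, f_stop):
    """Area (mm^2) of the circular lens opening at a given f/stop."""
    diameter = focal_length_mm / f_stop
    return math.pi * (diameter / 2) ** 2

previous = None
for f in F_STOPS:
    area = aperture_area(FOCAL_LENGTH_MM, f)
    note = "" if previous is None else f"  (previous area / this area = {previous / area:.2f})"
    print(f"f/{f:<4} area = {area:7.1f} mm^2{note}")
    previous = area
```

The ratios hover around 2 rather than hitting it exactly, because the numbers printed on the lens are rounded powers of the square root of 2.  That is also why f/1.4 vs. f/16 comes out close to, but not exactly, 128x.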

The older you get, the faster each year passes by. Why?

RE: Birthday Dinner
Yes, life does go by fast.  Strangely, the older you get, the faster it goes.  I do not know why this is.

Ever get an email like this?  Well, as your age varies, the percentage of your life that a single calendar year represents also varies.  As you get older, a year is a smaller percentage of your overall life.  In other words, 1 year represents 50% of a 2 year old’s life.  However, it is only 2% of a 50 year old’s life.  So, perhaps that is why each year seems to go by faster.

Want to see the percentage for every age from 1 to 80?  Yea, so do I.  Let’s make a formula and graph it.  The percentage of your life that a single year represents is just a function of your age:  f(age) = \frac{1}{age}  If you graph this on a spreadsheet, you’ll get the following:

How would you interpret this graph?  You’ll notice that once you pass the sharp bend (the “knee”) in the curve, the percentage seems to flatten out.  So, at what point can a person legitimately start saying “Wow, this year really flew by?”  Based on the graph, teenagers might feel this almost as much as middle aged people.

Lastly, do you notice how scaling of the y-axis makes the difference between age 15 and 50 look trivial?  In order to properly display percentage changes, I will scale the y-axis logarithmically.  Here is the result:

With this scaling, you can see there is, indeed, quite a difference between a teenager (~6%) vs. someone in their 50s (~2%)
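
If you don't have a spreadsheet handy, the same numbers fall out of a couple of lines of Python.  This is only a sketch of f(age) = 1/age; the function name is mine.

```python
def year_as_share_of_life(age):
    """Fraction of a person's life so far that one year represents."""
    if age <= 0:
        raise ValueError("age must be positive")
    return 1 / age

for age in (2, 15, 50, 80):
    print(f"age {age:2d}: one year = {year_as_share_of_life(age):.1%} of your life")
```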

 

The Wallaby That Roared Across the Wine Industry

The Wallaby That Roared Across the Wine Industry

By the end of 2001, 225,000 cases of Yellow Tail had been sold to retailers. In 2002, 1.2 million cases were sold. The figure climbed to 4.2 million in 2003 — including a million in October alone — and to 6.5 million in 2004. And, last year, sales surpassed 7.5 million — all for a wine that no one had heard of just five years earlier.

Prima facie, it looks like exponential growth.  But, in the real world, nothing ever grows exponentially in perpetuity (except college tuition, it seems)   I looked up sales figures for other years online.  Let’s plot these numbers in a spreadsheet, and see how they look.  As you can see the growth started to flatten out after a few years. I actually couldn’t find the sales data for 2008, so this calls for a statistical regression (fancy words for “line of best fit”).  A linear regression only yielded r=.88, while a 2nd degree polynomial (quadratic) regression gave an r = .96.  This regression equation is f(x) = -.13x^2 + 513x - 515887

Do you notice the negative leading coefficient of the x^2 term? Remember how this makes the parabola “frown”?  Well, this “inverted parabola” shape clearly reflects the flattening of the sales growth.

By just looking at the trendline, what’s your estimate for the number of cases sold in 2008? Or, plug 2008 into the equation to get the value on the red trendline:  f(2008) = -.13(2008)^2 + 513(2008) - 515887  (The coefficients shown here are rounded, and with x in the thousands that rounding matters a lot, so use the spreadsheet’s full-precision coefficients for the actual calculation.)

 

Why Casinos Don’t Lose Money

First, let’s illustrate the law of large numbers.  If you flipped a coin 10 times, you should expect to get 50% heads.  However, this may not happen.  You could get anything from 0 to 10 heads.  The most likely outcome is flipping exactly 5 heads, but even that only happens about 25% of the time.  Bigger deviations from 50% become increasingly less probable.  For example, the odds of flipping 6 or more heads is about 38%, and the odds of flipping 7 or more heads is about 17%.  Now, let’s say you flipped a coin 100 times.  The odds of getting at least 50% heads is still a little over 50%.  But, do you think the odds of getting 60 or more heads is also 38%?  It’s actually only about 3%.  It’s much easier to get 6 out of 10 heads than it is to get 60 out of 100 heads.  What are the odds of flipping 600 or more heads out of 1000 flips?  It’s roughly 0.00000001%.  With 1000 flips, you’re pretty much always going to get around 45%-55% heads.  Deviations beyond that range are very improbable.  So, in summary, the law of large numbers states that the more trials you have, the closer your actual outcome will be to the theoretical expected probability.  (In this case, the more coins you flip, the closer you’ll get to actually seeing 50% heads.)
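
Exact coin-flip odds like these can be computed directly from the binomial distribution.  Here is a minimal Python sketch for a fair coin (the function name is my own), which counts the favorable outcomes and divides by the 2^n total outcomes.

```python
from math import comb

def prob_at_least(k, n):
    """P(at least k heads in n flips of a fair coin), computed exactly."""
    favorable = sum(comb(n, i) for i in range(k, n + 1))
    return favorable / 2 ** n

for n in (10, 100, 1000):
    k = 6 * n // 10  # 60% heads
    print(f"P(at least {k} heads in {n} flips) = {prob_at_least(k, n):.3g}")
```

The same 60% target goes from fairly ordinary at 10 flips to essentially impossible at 1000 flips.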

So, how does this tie into casinos?  Let’s take the roulette wheel as our example.  There are 37 total numbers:  18 reds, 18 blacks, and 1 green.  If you guess red or black correctly, you’ll get a 1:1 payout (ie: If you bet $1, you’ll get back $2, thereby winning $1).  If the wheel lands on green (0), both red and black lose.  This is where the casino gets its edge in this particular gamble.  Let’s calculate the expected value of a $1 bet on red.

E[X] = \frac{18}{37}(\$1)+\frac{18}{37}(-\$1)+\frac{1}{37}(-\$1) = -\frac{\$1}{37} \approx -\$.03

 

What this means is you have an 18/37 chance of winning $1 (if it lands on red), an 18/37 chance of losing $1 (if it lands on black), and a 1/37 chance of losing $1 (if it lands on green).  The expected profit for playing this game is negative 2.7 cents, or about negative 3 cents.  Now, sometimes you’ll win, and sometimes you’ll lose, but if you play enough times, you’ll be averaging a loss of about 3 cents per round.  This is where the law of large numbers comes into play.  As long as enough people are playing, the house will be averaging a profit of about 3 cents for every dollar bet on that roulette table.
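
Here is the same expected value calculation as a short Python sketch, using exact fractions so no rounding sneaks in.  The helper function is my own and works for any list of (probability, profit) pairs.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Expected profit from a list of (probability, profit) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    if total_probability != 1:
        raise ValueError("probabilities must add up to 1")
    return sum(p * profit for p, profit in outcomes)

# $1 bet on red: win on 18 reds, lose on 18 blacks, lose on the green 0
red_bet = [
    (Fraction(18, 37), 1),
    (Fraction(18, 37), -1),
    (Fraction(1, 37), -1),
]

ev = expected_value(red_bet)
print(f"E[X] = {ev} dollars, or about {float(ev) * 100:.1f} cents per $1 bet")
```

You can reuse expected_value() for other bets by swapping in a different list of outcomes.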

Question: What is the expected value of correctly guessing a specific number? There are 37 numbers, but the payout is 35:1 (You get paid $35 for each dollar you bet)  Based on this answer, is it smarter to try guessing the color or guessing the number?

 

The Beatles meet Mathematics & Physics


 

Mathematics, Physics and A Hard Day’s Night

In this article we shall use mathematics and the physics of sound to unravel one of the mysteries of rock ’n’ roll – how did the Beatles play the opening chord of A Hard Day’s Night? The song may never sound the same to you again.

 

I just love this paper.  Professor Jason Brown sampled the famous opening chord of this song and (using Math) separated out all the distinct frequencies in the clip.  From this, he was able to determine each individual note that was played.  From there, he determined exactly what each member of the band played.  He even discovered a surprising element relating to George Martin’s 5th Beatle status.

Professor Brown took each frequency and converted it to a musical note on the Western scale.  Here is the function he used to do this:

f(x) = 12 log_2(\frac{x}{220})    (…where 220 hertz is the frequency for A natural.)

I thought I’d try his calculation myself, because it’s a good opportunity to use the change of base formula for logarithms.  If you look in the original white paper, the first frequency in the table is 110.34   Let’s plug this into the function:

f(110.34) = 12 log_2(\frac{110.34}{220}) = 12 log_2(.5015)

How do you evaluate the above expression?  There is no log_2() button on most calculators.  Well, here’s where the change of base formula for logs comes in:  log_b(x) = \frac{log_d(x)}{log_d(b)}   So, let’s choose log_{10} (since calculators do have this) and continue:

12 log_2(.5015)=12(\frac{log_{10}.5015}{log_{10}2}) = 12(-.9955) = -11.9466

 

In other words, 110.34hz is 11.9466 semi-tones below the note of A.  It should really be 12, but as Professor Brown noted, the Beatles’ instruments weren’t in perfect tune, so the values are not whole numbers!

So, which note is 12 semi-tones below A?  Actually, 12 semi-tones make an octave, so the answer is another A note (one octave lower).  Using this method, he determined every note that was played.  The rest of the paper describes how he deduced which groups of notes were played by which instrument/band member.  Fascinating.
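
For anyone who wants to try other frequencies, here is a small Python sketch of the conversion.  It computes the result two ways, with log base 2 directly and with the change of base formula, just to show they agree.  The function names are mine; the 220 Hz reference is the A from the formula above.

```python
import math

def semitones_from_a(freq_hz, reference_hz=220.0):
    """Semitones above (+) or below (-) the reference A, using log base 2."""
    return 12 * math.log2(freq_hz / reference_hz)

def semitones_change_of_base(freq_hz, reference_hz=220.0):
    """Same calculation, using log base 10 and the change of base formula."""
    ratio = freq_hz / reference_hz
    return 12 * math.log10(ratio) / math.log10(2)

print(f"{semitones_from_a(110.34):.4f}")
print(f"{semitones_change_of_base(110.34):.4f}")
print(f"{semitones_from_a(110.0):.4f}")  # a perfectly tuned A, one octave down
```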

 

Does taking LSD prevent crime?

http://en.wikipedia.org/wiki/History_of_LSD

Dr. Leary began conducting experiments with psilocybin in 1960 on himself and a number of Harvard graduate students after trying hallucinogenic mushrooms used in Native American religious rituals while visiting Mexico. His group began conducting experiments on state prisoners, where they claimed a 90% success rate preventing repeat offenses. Later reexamination of Leary’s data reveals his results to be skewed, whether intentionally or not; the percent of men in the study who ended up back in prison later in life was approximately 2% lower than the usual rate.

Well, the question is this:  Was the drop from 92% down to 90% explained by random chance, or did the LSD really have a statistically significant impact on reducing the crime rate?  Since the text does not provide a sample size, I will just use n=100 to do the math.

The calculations:

H0:  LSD takers had no difference in their repeat offense rates.
HA: LSD takers did have a difference in their repeat offense rates.

 

First, take stock of the given information:

n = 100 \\\\ p = .92 \\\\ \hat{p}=.90

 

Next, you calculate the standard deviation of samples of this size.

SD(\hat{p})= \sqrt{\frac{(.92)(.08)}{100}}=.03

 

To determine how unlikely your sampling result was, you calculate how many standard deviations away from the expected proportion it was (Z-score).

Z(\hat{p})= \frac{\hat{p}-p_0}{SD(\hat{p})}=\frac{.90-.92}{.03}=-.67

 

Then, you calculate the odds of getting this Z-score via the normal cumulative distribution function.  (What are the odds of this happening randomly?)  If it’s under 5%, then you reject the null hypothesis, because it’s unlikely this variation can be attributed to random chance.  ie: Odds are, the LSD really did make a difference.

p(Z \le -.67) = .25 = 25\%

 

Conclusion:  If the odds of being a repeat offender is 92%, then having 90% (or fewer) repeat offenders in a random sample of 100 men is quite likely.  The math shows that the odds of this reduction simply happening by chance (random variations) is 25%.  This is large enough (over 5%) that we cannot conclude the LSD had any true effect on reducing the crime rate.  ie:  The 2% reduction could easily be due to chance.  So, we fail to reject the null hypothesis (H0):  In a sample of 100 test subjects, a drop in the repeat offender rate to 90% is not enough evidence that the LSD had an effect.

 So, do you have the same lingering question that I did?  How large would the sample size have to be in order for the 2% drop to not be an accident? (Recall, I just made up n=100).  Well, some simple algebra should answer this for us:

First, let’s determine the Z-score at the 5th percentile:

invNorm(.05) = -1.64

 

Let’s use that in the Z-score calculation to figure out what standard deviation we’d need

-1.64 = \frac{.90-.92}{SD}     (…SD = .0122)

 

Backing this into the SD formula will help us solve for the sample size (n)
.0122= \sqrt{\frac{(.92)(.08)}{n}}    (…n = 495)

So, if Timothy Leary showed a repeat offender drop of 2% with a sample size of 495, then we could say the LSD did have an effect.  Why?  Because that much of a drop only has a 5% chance of happening randomly.
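
Here is a hedged Python sketch of the whole test, using the standard library's NormalDist in place of the calculator's normalCDF and invNorm.  The function names are my own, and n=100 is still my made-up sample size.  Because the code doesn't round the intermediate steps, its answers land close to, but not exactly on, the hand-calculated ones.

```python
import math
from statistics import NormalDist

def one_proportion_z(p0, p_hat, n):
    """Return (standard deviation of the sample proportion, Z-score)."""
    sd = math.sqrt(p0 * (1 - p0) / n)
    return sd, (p_hat - p0) / sd

def required_sample_size(p0, p_hat, alpha=0.05):
    """Sample size at which a drop from p0 to p_hat sits exactly at the alpha cutoff."""
    z_cutoff = NormalDist().inv_cdf(alpha)   # about -1.64 for alpha = .05
    sd_needed = (p_hat - p0) / z_cutoff
    return p0 * (1 - p0) / sd_needed ** 2

sd, z = one_proportion_z(0.92, 0.90, 100)
p_value = NormalDist().cdf(z)                # lower tail, since the rate dropped
print(f"SD = {sd:.4f}, Z = {z:.2f}, chance of a drop this big = {p_value:.1%}")

n_needed = required_sample_size(0.92, 0.90)
print(f"sample size needed: about {math.ceil(n_needed)}")
```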

Average hours of sleep (normal distribution)

How Little Sleep Can You Get Away With?

 

Nice example of a real life phenomenon closely modelling a Gaussian normal distribution.  The average hours of sleep on a weeknight (for males) was 6.9 hours with a standard deviation of 1.5 hours.  Using this data, let’s calculate what percentage of men get a good night’s sleep (8 hours or more).  The diagram indicates 27%, while the normal model gives about 23%:

Z = \frac{8 - 6.9}{1.5} = .73

 

normalCDF(.73,99) = .23 = 23\%
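
The same calculation as a short Python sketch, using the standard library's NormalDist instead of a calculator.  The 8 hour cutoff for "a good night's sleep" is the one used in the Z-score above.

```python
from statistics import NormalDist

# Weeknight sleep for males: mean 6.9 hours, standard deviation 1.5 hours
sleep = NormalDist(mu=6.9, sigma=1.5)

z = (8 - sleep.mean) / sleep.stdev
share_8_plus = 1 - sleep.cdf(8)

print(f"Z = {z:.2f}")
print(f"share sleeping 8+ hours = {share_8_plus:.1%}")
```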

 

 

The Statistics of Gaydar

The Science of Gaydar

Lippa had gathered survey data from more than 50 short-haired men and photographed their pates (women were excluded because their hairstyles, even at the pride festival, were too long for simple determination; crewcuts are the ideal Rorschach, he explains). About 23 percent had counterclockwise hair whorls. In the general population, that figure is 8 percent.

Well, just how meaningful is this 23% discrepancy from the norm of 8%?  Maybe it’s just randomness, right?  Well, a mathaholic will read something like this and try the omitted calculations for himself.  This is an example of a “hypothesis test” in Statistics.  The Null Hypothesis (H0) says that there is no difference in the groups.  The Alternative Hypothesis (HA) says there is a statistically significant difference in the groups.  In a hypothesis test, the essential question is this:  What are the odds that a sample varies this much from the expected percentage (proportion) simply due to natural random variation?  (For example, if you flip a coin 10 times, the most common result is 5 heads.  Sometimes, however, you might get 6.  In fact, that should happen about 21% of the time.  Nothing to be alarmed about.  However, the odds of getting exactly 8 heads is only about 4%.  If you do get 8 heads, that’s rare enough to indicate the coin might be rigged.  Odds are you won’t do it again!)

So, for this hair test, we need to ask, “What are the odds of taking a sample of 50 guys and seeing that 23% have a counterclockwise whorl?”  We should expect to get 8%, as per the broad population.  If it’s very very rare to get 23%, then we might suspect there is a connection, and gay men do have different hair swirls than the broad population.  In Statistics, we define “very very rare” as under 5%.  In other words, if the odds that 23% of a sample of 50 have a counterclockwise whorl is under 5%, then it is statistically significant.

 

The calculations:

H0:  Gay men have no difference in their hair whorl orientation.
HA: Gay men do have a difference in their hair whorl orientation.

 

First, take stock of the given information:

n = 50 \\\\ p = .08 \\\\ \hat{p}=.23

 

Next, you calculate the standard deviation of samples of this size.

SD(\hat{p})= \sqrt{\frac{(.08)(.92)}{50}}=.04

 

To determine how unlikely your sampling result was, you calculate how many standard deviations away from the expected proportion it was (Z-score).

Z(\hat{p})= \frac{\hat{p}-p_0}{SD(\hat{p})}=\frac{.23-.08}{.04}=3.75

 

Then, you calculate the odds of getting this Z-score via the normal cumulative distribution function.  (What are the odds of this happening randomly?)  If it’s under 5%, then you reject the null hypothesis, because it’s unlikely this variation can be attributed to random chance.  ie: Odds are, the hair is indeed different.

p(Z \ge 3.75) = .000088 \approx 0\%

 

Conclusion:  If the odds of having a counterclockwise hair whorl is 8%, then having 23% of 50 random men exhibit this trait is very unlikely.  The odds of this happening by chance (random variations) is basically 0%.  So, we reject the null hypothesis (H0), and accept the alternative hypothesis (HA)
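
Here is a minimal Python sketch of this test, this time looking at the upper tail since the sample came in higher than expected.  The function name is my own.  It keeps the standard deviation unrounded, so the Z-score comes out a bit larger than the hand calculation, but the conclusion is the same.

```python
import math
from statistics import NormalDist

def upper_tail_test(p0, p_hat, n):
    """Return (Z-score, chance of a sample proportion this high or higher)."""
    sd = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd
    return z, 1 - NormalDist().cdf(z)

z, p_value = upper_tail_test(p0=0.08, p_hat=0.23, n=50)
print(f"Z = {z:.2f}, p-value = {p_value:.6f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```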

 

 

Optimal Snowboard Length Formula

http://www.livestrong.com/article/87496-size-snowboards/

Evaluate your height. This is the best way to determine snowboard length. One typical formula used by professional snowboards is: rider height (in inches) x 2.54 x 0.88 = suggested snowboard length. This will help you to start narrowing down your snowboard choices.

When I saw this formula, I wondered what it meant.  Note that snowboards are measured in centimeter units.  Well, to convert inches to centimeters, you multiply by 2.54.  So, we’re converting height to centimeters and then taking 88% of that.  I have no idea where the 88% rule comes from.  The point being, the “formula” is just saying to get a snowboard that is 88% of your height.  This lines up with my mouth, and I am pretty sure I am well-proportioned.  So, maybe it’s just easier to say that a snowboard should come up to your mouth.

This is also an example of intentionally not simplifying an expression because you lose some inherent meaning (explicit unit conversion, etc).  Otherwise, the equation could be simplified to length = 2.235 * rider height (in inches)
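
In code, keeping the two constants separate (and named) preserves that same meaning.  A tiny Python sketch, with names of my own choosing:

```python
CM_PER_INCH = 2.54        # unit conversion
HEIGHT_FRACTION = 0.88    # the "88% of your height" rule from the quoted formula

def suggested_board_length_cm(height_inches):
    """Suggested snowboard length in centimeters for a rider height in inches."""
    return height_inches * CM_PER_INCH * HEIGHT_FRACTION

print(f"{suggested_board_length_cm(70):.1f} cm for a 70 inch rider")
```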

 

How does Netflix predict which movies you’ll like best?

The simplest way to predict your rating for a movie is simply to average everyone else’s rating of the movie.  (ie:  They can just give you the 10 movies with the highest average rating)  Of course, it can get much more complex than that, especially when NFLX was giving away a million dollars to anyone who could improve their rating algorithm!  The real crux of this problem is to find other people who are most like you, and then use their collective ratings on movies you haven’t seen yet.

Neighborhood-based model (k-NN): The general idea is “other people who rated X similarly to you… also liked Y”.  To predict if John will like “Toxic Avenger”, first you take each of John’s existing movie ratings, and for each one (eg: “Rocky”), find the people who rated both “Rocky” & “Toxic Avenger”.  You then compare the ratings given to both movies by these people, and calculate how correlated these 2 movies are.  If there’s a strong correlation between their ratings, then “Rocky” is a strong neighbor in predicting John’s rating for “Toxic Avenger”, and you’ll weigh the average rating given (by “Rocky raters”) to “Toxic Avenger” highly.  You do this for all the movies that John has already rated, find each one’s strongest neighbor(s), and calculate a predicted “Toxic Avenger” rating from each movie John has already rated.  You then calculate a weighted average of all these predictions to come up with your ultimate prediction for John’s rating of “Toxic Avenger”.  Lastly, if you do this for every movie in the entire database, you can determine a “Top 10 suggestions” list for John.
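
To make the idea concrete, here is a heavily simplified Python sketch.  The ratings are made-up toy data, the function names are my own, and the prediction step is a shortcut: it takes a correlation-weighted average of John's own ratings for the positively correlated neighbor movies.  It is nowhere near what the contest teams actually built.

```python
from math import sqrt

# Made-up toy ratings (1-5 stars): user -> {movie: rating}
RATINGS = {
    "ann":  {"Rocky": 5, "Toxic Avenger": 4, "Amelie": 1},
    "bob":  {"Rocky": 4, "Toxic Avenger": 5, "Amelie": 2},
    "cat":  {"Rocky": 1, "Toxic Avenger": 2, "Amelie": 5},
    "dan":  {"Rocky": 2, "Toxic Avenger": 1, "Amelie": 4},
    "john": {"Rocky": 5, "Amelie": 2},
}

def movie_correlation(ratings, movie_a, movie_b):
    """Pearson correlation of two movies over the users who rated both."""
    pairs = [(r[movie_a], r[movie_b]) for r in ratings.values()
             if movie_a in r and movie_b in r]
    if len(pairs) < 2:
        return 0.0
    xs, ys = zip(*pairs)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def predict_rating(ratings, user, target_movie):
    """Weighted average of the user's ratings, weighted by positive correlation."""
    weighted_sum = weight_total = 0.0
    for movie, rating in ratings[user].items():
        weight = movie_correlation(ratings, movie, target_movie)
        if weight > 0:  # only keep movies that behave like the target
            weighted_sum += weight * rating
            weight_total += weight
    return weighted_sum / weight_total if weight_total else None

print(movie_correlation(RATINGS, "Rocky", "Toxic Avenger"))
print(predict_rating(RATINGS, "john", "Toxic Avenger"))
```

In this toy data, "Rocky" is a strong neighbor of "Toxic Avenger" while "Amelie" is negatively correlated with it, so only John's "Rocky" rating ends up driving the prediction.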

 

Here is some general reading on the contest:

The BellKor solution to the Netflix Prize

This Psychologist Might Outsmart the Math Brains Competing for the Netflix Prize

The Netflix Prize: 300 Days Later

The Greater Collaborative Filtering Groupthink: KNN

 

 

How They Check if a Credit Card Is Valid

Before the era of ethernet, TCP/IP, and packet verification, electronic data was transmitted over telephone wires.  (Think back to the days of AOL & modems, and that screeching noise when you connected.)  A big problem with this method was the risk of external interference garbling your signal.  What if a bird flew into the phone wire?  Or if someone picked up the other line?   Or it started raining?  Any external interference could garble whatever information was being sent at that moment.  So, if you were transmitting something like a credit card number, how would the receiver know that it wasn’t garbled?  (For example, what if a 3 was garbled into a 4?)   This was a problem solved at IBM back in the 1950s.  Of course, the same issues arise when human error is introduced.  (What if you are reading the credit card aloud over the phone, and the other person types in one of the digits incorrectly?)

For the rest of this post, I am merely introducing a lecture I attended called “Identification Numbers and Check Digit Schemes”, by Joe Kirtland.  Thanks to Joe for sharing his full PPT slides with me (link below).  It talks about the checkdigit systems used to validate credit cards, bar-codes, ISBN numbers, currency serial numbers, etc.  It’s a great real-world application of Algebra, Geometry, and algorithms.

Here is a very crude example that illustrates the concept:  Let’s say you want to transmit the number “34515”.  One validation algorithm requires that the sum of the individual digits be divisible by 10 (mod 10 = 0).  Currently, the sum of the digits is 18 (3+4+5+1+5=18).  So, to make the sum divisible by 10, you just tack on a 2 at the end, and transmit “345152”.  The receiver of the data is told to ignore that last number, which is called a check digit.  If the final number he gets doesn’t check out, something went wrong, and you need to resend.
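
Here is that crude scheme as a short Python sketch.  The function names are mine, and this is only the simple "digits must sum to a multiple of 10" rule from the example, not the scheme real credit cards use.

```python
def check_digit(digits):
    """Digit to append so that the sum of all digits is divisible by 10."""
    total = sum(int(d) for d in digits)
    return (10 - total % 10) % 10

def add_check_digit(digits):
    """Return the number with its check digit tacked on the end."""
    return digits + str(check_digit(digits))

def is_valid(digits_with_check):
    """True if the digits (including the check digit) sum to a multiple of 10."""
    return sum(int(d) for d in digits_with_check) % 10 == 0

sent = add_check_digit("34515")
print(sent, is_valid(sent))

# A single garbled digit (the 1 received as a 2) gets caught
print(is_valid("345252"))
```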

Question:  This method is not foolproof, can you think of some reasons why?

If this topic interests you, check out the following slides, where Joe explored check digits as they relate to UPC codes, ISBN numbers, credit cards, and serial numbers on various currency.

Click here to view Joe’s full PPT lecture

 

What’s more likely to break down: a car with 200 miles or 20,000 miles?

In manufacturing/engineering, there is a concept known as the bathtub curve.  In theory, something that is brand new (including a human being!) is more likely to have failures than something that is a little older and has worked out those early kinks.  Of course, once the product gets old, you’ll start having new reasons for failure (things wearing out).

I don’t have much to add on this topic, but I created this post for two simple reasons:

  • First, it’s a nice example of a Cartesian graph that makes intuitive sense.  The aggregate blue curve indicates that new things can be lemons, then they work smoothly, and then they wear out and start breaking again.
  • Also, I think it’s a great example of an authentic real-life piecewise function.   As you can see, it has three very different sections.

 

Question: In terms of cars, do you think the blue curve above would be so symmetric?

 

Projectiles: What’s the optimal angle at which to throw something? (to maximize distance)

Wow, a real life formula that uses the double angle trig. identities!  The formula in question is the standard range equation for a projectile launched over level ground (ignoring air resistance):  d = \frac{v^2 sin(2\theta)}{g}  As you can see, the distance a projectile will travel is a function of:  velocity, gravity, and the launch angle.

First, a quick fraction review:  Recall that \frac{1}{1000} is a lot smaller than \frac{1}{10}.  Conversely, we can also agree that \frac{1}{100} is a lot smaller than \frac{99}{100}.  ie: The bigger the denominator (and/or smaller the numerator), the lower the value of the (positive) fraction.

So, since v is in the numerator, the distance traveled (d) increases directly with velocity (in a big way, since it’s squared)  Next, since g is in the denominator, the distance traveled decreases as gravity increases.  (Makes sense, right?)


This is a graph of all Sin(x) values from 0 to 360.  The x-axis is divided into quadrants (0, 90, 180, 270, 360). Notice in the graph that Sin(x) rises from 0 to 1 as x rises from 0 to 90 degrees.  Then, it drops from 1 back to 0 as x rises from 90 to 180 degrees.

Refer back to the double angle Sin(2\theta) in the original formula up top.  As the launch angle \theta rises from 0 to 45 degrees, 2\theta actually rises from 0 to 90, and the Sin(2\theta) value is increasing.  But, as \theta continues to rise from 45 to 90 degrees, 2\theta rises from 90 to 180, which means the Sin(2\theta) value is now decreasing.

So, what’s the ideal angle to throw something?  The one that maximizes the value of Sin(2\theta), since it’s a multiplier in the projectile formula.  Well, as you can see in the graph, Sin(90) = 1, the highest possible value for Sin(x).  So, the ideal launch angle is \theta = 45 degrees (which puts 2\theta at 90).
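
You can also let a few lines of Python find the best angle by brute force.  This sketch assumes the standard range formula for level ground with no air resistance, d = v^2 sin(2θ) / g.  The 20 m/s launch speed and g = 9.81 m/s^2 are just example values I picked, and the function name is my own.

```python
import math

def projectile_range(v, theta_degrees, g=9.81):
    """Horizontal distance traveled for launch speed v and launch angle theta."""
    return v ** 2 * math.sin(math.radians(2 * theta_degrees)) / g

speed = 20  # meters per second (example value)

# Try every whole-degree angle and keep the one that travels farthest
best_angle = max(range(0, 91), key=lambda angle: projectile_range(speed, angle))
print(f"best angle = {best_angle} degrees")

for angle in (15, 30, 45, 60, 75):
    print(f"{angle} degrees -> {projectile_range(speed, angle):.1f} m")
```

Notice that 30 and 60 degrees (and 15 and 75) travel the same distance, since Sin(2θ) is symmetric around 45 degrees.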

So, now you know why the ideal angle in these video games is 45 degrees, and have an inkling of how programmers create classic games like these:

Quitting While You’re Ahead? Random Walks & Markov Chains

Question:  What if you walked up to a “fair” casino game (50/50 odds of winning) that pays even odds with $10,000, and said you would quit as soon as you’re up $1,000 ? (ie: You either walk out with $11,000 or keep playing until you lose everything)  What are the odds of you leaving the casino with a $1000 profit?

A Markov chain model lets you simulate many random walks of this experiment.  With a large enough number of simulated rounds, you can get an accurate sense of the odds of walking out with the $1,000 profit.

The following code runs this simulation as many times as you want, and tells you how many times you got to $11,000.

#!/usr/local/bin/perl

use strict;
use warnings;
use Getopt::Long;

# Number of simulated casino visits; override with --rounds=N
# (this flag replaces the original options, which were never used)
my $rounds = 100;
GetOptions("rounds=i" => \$rounds) or die "Usage: $0 [--rounds=N]\n";

# All amounts are in units of $1,000 (start with 10, stop at 11 or 0)
my $starting_amt = 10;
my $low_bound    = 0;
my $high_bound   = 11;
my $lost = 0;
my $won  = 0;

for my $current_round (1 .. $rounds) {

	# start every round with the full bankroll
	my $current_amt = $starting_amt;

	while ($current_amt > $low_bound && $current_amt < $high_bound) {

		# fair coin flip: +1 or -1 with equal probability
		my $x = (rand() < 0.5) ? 1 : -1;

		# update total
		$current_amt += $x;

		# print current total
		print "$current_amt ";
	}

	print "\n";

	# increment rounds won or lost...
	if ($current_amt == $low_bound) {
		$lost++;
	} else {
		$won++;
	}
}

print "\nWon = $won\n";
print "Lost = $lost\n";

Answer:  Doing 100 rounds is enough to get a rough estimate (more rounds will tighten it).  On average, you’ll walk out with $11,000 about 91% of the time you try this experiment (the exact odds are 10/11).  Why is this still a bad idea?  In practice, gamblers rarely quit while they are ahead, and you still have that 9% chance of a total washout.  Recall, the expected value of this game is still break even:  a 10/11 chance of winning $1,000 is exactly offset by a 1/11 chance of losing $10,000.
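
You don't strictly need a simulation for this one.  For a fair game with equal-sized bets, the chance of reaching your target before going broke is simply (starting amount) / (target amount), a classic result known as gambler's ruin.  Here is a short Python sketch that checks the formula against the "average of your two neighbors" rule a fair random walk has to obey.  The function names are my own, and amounts are in units of $1,000.

```python
from fractions import Fraction

def win_probability(start, target):
    """Chance of reaching target before 0 in a fair +1/-1 walk starting at start."""
    return Fraction(start, target)

def obeys_fair_walk_rule(target):
    """Check p(0)=0, p(target)=1, and p(i) = average of p(i-1) and p(i+1)."""
    p = [win_probability(i, target) for i in range(target + 1)]
    interior_ok = all(p[i] == (p[i - 1] + p[i + 1]) / 2 for i in range(1, target))
    return p[0] == 0 and p[target] == 1 and interior_ok

p_win = win_probability(10, 11)
expected_profit = p_win * 1 + (1 - p_win) * (-10)  # in units of $1,000

print(f"P(walk out with $11,000) = {p_win} = {float(p_win):.1%}")
print(f"formula obeys the fair-walk rule: {obeys_fair_walk_rule(11)}")
print(f"expected profit = {expected_profit}")
```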

Here is a sample output of 10 rounds:

Won = 9
Lost = 1
