You are welcome to browse this page to see some of the other posts I have made, but for the most part I will not be using this website this year, apart from updates about the IHS Quiz Bowl team!


In class, you learned that there are different types of infinity, and while the natural numbers are “countably infinite” (as are the integers and rational numbers), the real numbers are “uncountably infinite.” See this post if you missed this explanation in class.

At each stage of the Cantor Set, we remove a portion of the interval [0,1]. The endpoints of the portions we leave behind can be mapped to the natural numbers, so in order for us to claim that the Cantor Set is uncountable, there must be *other members* of the set left over, an uncountable quantity of numbers/points that **aren’t endpoints of segments, but also are never within segments that are removed**. To see where these numbers are, we have to look at base-3 fractions.

Our number system is base 10 (or *decimal*). When we consider a number like 167, what we are describing is a quantity made up of one 100, six 10s, and seven 1s. Each place value of a number written in base-10 is a power of 10, and there are 10 possible digits for each place value. The same goes to the right of the decimal point as well: if we decrease powers from 10^2 = 100, to 10^1 = 10, to 10^0 = 1, we continue to get 10^-1 = 1/10, 10^-2 = 1/100, and so on. As a result, a fraction like 1/4 can be expressed as the decimal 0.25, which corresponds to 2*10^-1 + 5*10^-2, perhaps more easily thought of as 2/10^1 + 5/10^2.

If we want to use base-3 fractions, then the principle is the same, but everything is in threes. The denominators of our fractions are powers of three, and we only allow three options for the numerator: 0, 1, and 2. As soon as we hit something like 3/3^2, what we actually have is 3/9 = 1/3 = 1/3^1. So there’s only room for those three digits in the numerator before any fraction reduces to a different fraction with a lower power in its denominator.

Unfortunately, perhaps, this makes fractions like 1/4 much more awkward to express. The base-3 fractional expansion of 1/4 is 0/3^1 + 2/3^2 + 0/3^3 + 2/3^4 + …

Proof: 0/3^1 + 2/3^2 + 0/3^3 + 2/3^4 + … = 2/3^2 + 2/3^4 + 2/3^6 + …

= 2/9 (1 + 1/3^2 + 1/3^4 + 1/3^6 + …), the latter part of which is a geometric series with a1 = 1 and r = 1/9. Thus, this infinite series equals…

= 2/9 * 1/(1 – 1/9) = 2/9 * 9/8 = 2/8 = 1/4.
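We can check this numerically. The sketch below (plain Python, using exact fractions) adds up the first twenty terms of the base-3 expansion with numerators 0, 2, 0, 2, … and watches the partial sums close in on 1/4:

```python
from fractions import Fraction

# Partial sums of 0/3^1 + 2/3^2 + 0/3^3 + 2/3^4 + ...
s = Fraction(0)
for k in range(1, 21):
    digit = 2 if k % 2 == 0 else 0  # numerators alternate 0, 2, 0, 2, ...
    s += Fraction(digit, 3**k)

print(float(s))                   # 0.25 to within 1/(4 * 9**10)
print(float(Fraction(1, 4) - s))  # the leftover tail, vanishingly small
```

Each pair of additional terms shrinks the remaining error by a factor of 9, exactly as the geometric series suggests.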

Consider some member of the Cantor Set *x*. We can express *x* as an infinite series of fractions a1/3^1 + a2/3^2 + a3/3^3 + a4/3^4 + …, where each numerator is a digit 0, 1, or 2. But what if we say no numerator can equal 1?

- If a1 is not equal to 1, then x is not in the interval [1/3, 2/3] and is therefore not removed in Stage 1.
- If a2 is not equal to 1, then x is not in the interval [1/9, 2/9] nor [7/9, 8/9] (since 2/3 + 1/9 = 7/9), and is therefore not removed in Stage 2.
- And so on.

So if the numerators of *x* are nothing but 0’s and 2’s, then *x* is in no interval that is removed at any stage of the creation of the Cantor Set. This means that *x* is a member of the Set (so, for example, the number 1/4 is a member of the Cantor Set!).
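A quick sketch (plain Python, with exact arithmetic via `fractions` to avoid rounding trouble) makes this concrete: simulate the first several stages of middle-third removal and check that 1/4, whose base-3 numerators are 0, 2, 0, 2, …, is still inside one of the surviving intervals every time.

```python
from fractions import Fraction

def cantor_intervals(stages):
    """Closed intervals remaining after `stages` rounds of removing middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(stages):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))   # keep the left third
            nxt.append((b - third, b))   # keep the right third
        intervals = nxt
    return intervals

x = Fraction(1, 4)  # base-3 numerators 0, 2, 0, 2, ...
for n in range(1, 11):
    assert any(a <= x <= b for a, b in cantor_intervals(n))
print("1/4 survives the first 10 stages of removal")
```

No finite simulation is a proof, of course; the digit argument above is what guarantees 1/4 survives *every* stage.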

The set of ways to arrange 0’s and 2’s among the numerators of this fraction can be mapped onto the segment [0,1] (replace every 2 with a 1 and read the result as a binary expansion, via an argument similar to Cantor’s Diagonal Argument), meaning the Cantor Set shares cardinality with the real interval [0,1], meaning the Cantor Set is uncountably infinite.

QED.

Consider the equation *x*³ + *y*³ + *z*³ = *k*. Easily understood: take three integers {*x*, *y*, *z*}, cube them, and add them together. In 1955, mathematicians at the University of Cambridge asked whether a set of {*x*, *y*, *z*} could be found to sum to every positive integer *k* less than 100. Some were easy to find: (-5)³ + 7³ + (-6)³ = 2; 2³ + (-3)³ + 4³ = 45; 25³ + (-17)³ + (-22)³ = 64. But others proved surprisingly challenging, requiring cubes of much larger numbers (51 is the sum of the cubes of -796, 659, and 602, and the solution for 30, found only in 1999, required the cubes of 2,220,422,932, -2,218,888,517, and -283,059,965). Worse, there appeared to be little pattern in the trios of numbers that worked, so finding new solutions mostly amounted to an enormous guess-and-check procedure. A pair of mathematicians proved in 1979 that any number of the form 9*n* – 4 or 9*n* + 4 (4, 5, 13, 14, 22, 23, 31, 32, 40, 41, 49, 50, 58, 59, 67, 68, 76, 77, 85, 86, 94, and 95) could *not* be expressed as the sum of three cubes, which took several elusive numbers out of the running, but the search wore on to finish the list. Until this year.

At the start of 2019, a sum of cubes had been found for **every possible positive integer k < 100** except two: 33 and 42. Then, earlier this year, mathematician Andrew Booker found a solution for 33:

33 = (8,866,128,975,287,528)³ + (–8,778,405,442,862,239)³ + (–2,736,111,468,807,040)³

When he found the solution, Booker said he literally jumped for joy. But his job wasn’t done! There was still one number to be solved, and he knew this task would be too large for even his university’s computer.

So he turned to MIT’s Andrew Sutherland and a worldwide distributed-computing project called Charity Engine. Members of the project from around the world run a program that uses their computers’ downtime for data crunching, effectively donating their devices’ computing time to a variety of causes. Fans of Douglas Adams’s The Hitchhiker’s Guide to the Galaxy will notice a similarity to the story, in which a computer the size of a planet is constructed to find the “Ultimate Question of Life, the Universe, and Everything,” the answer to which a previous supercomputer had identified as 42.

After a combined computing time equivalent to almost 150 years, they found the answer this month:

42 = (-80,538,738,812,075,974)³ + (80,435,758,145,817,515)³ + (12,602,123,297,335,631)³

The next-lowest number to have an unknown sum of cubes is 114, and in fact there are only ten numbers less than 1000 for which such a solution is unknown.

If you’re interested in learning more about this mathematical puzzle, I’d suggest you start with this Numberphile video that Booker says was his inspiration to start on his hunt. You can see an interview with Booker after his cracking of 33 here, and a recent followup here.


First things first: why is *r* necessarily a value between -1 and 1? The math box on page 180 of your text gives an argument for this tied to *r*². But here’s another argument.

Recall the alternative definition for *r* given in the previous post:

*r* = cov(X, Y) / (s_x * s_y)    (Formula 1)

where the numerator represents the **covariance** of the X and Y variables, given by

cov(X, Y) = Σ(x – x̄)(y – ȳ) / (n – 1)

Recall that, intuitively, the covariance of two variables is a measure of the *joint variability* of the two variables. In general, if X and Y are above and below average together, then the covariance (and by extension the correlation) is positive. Inversely, if X tends to be above average when Y is below, and vice versa, the covariance is negative. Finally, if X and Y are independent, their covariance is 0: there is no consistency to when each variable is above or below average, so the positive and negative products cancel out. (The converse isn’t quite true — a covariance of 0 doesn’t guarantee independence — but independence does guarantee a covariance of 0.)

So by that argument, the **sign** of *r* captures the direction of the association, and |*r*| ≥ 0, with equality when there is no linear association at all. To show that |*r*| ≤ 1, we must use something called the Cauchy-Schwarz Inequality. This inequality shows that, in essence, the square of the sum of a series of products is necessarily less than or equal to the product of the sums of the squares of the numbers being multiplied. In symbols:

(Σ a_i b_i)² ≤ (Σ a_i²)(Σ b_i²)

Consider again Formula 1 above. The standard deviation of a variable is the square root of its variance, so another way of writing Formula 1 would be:

*r* = Σ(x – x̄)(y – ȳ) / √[ Σ(x – x̄)² * Σ(y – ȳ)² ]

(the factors of *n* – 1 in the numerator and denominator cancel). And therefore, applying Cauchy-Schwarz with a_i = x_i – x̄ and b_i = y_i – ȳ:

*r*² = [ Σ(x – x̄)(y – ȳ) ]² / [ Σ(x – x̄)² * Σ(y – ȳ)² ] ≤ 1

Hence, 0 ≤ |*r*| ≤ 1, meaning *r* is necessarily bound between -1 and 1.
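As a sanity check, here is a small Python sketch (the function name `pearson_r` is just for illustration) that computes *r* straight from the definitions and confirms it never escapes [-1, 1], hitting the endpoints exactly when the data are perfectly linear:

```python
import math

def pearson_r(xs, ys):
    """r = cov(X, Y) / (s_x * s_y), computed from the sample definitions."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return cov / (sx * sy)

print(pearson_r([1, 2, 3, 4], [3, 5, 7, 9]))   # perfectly linear: r is (about) 1
print(pearson_r([1, 2, 3, 4], [9, 7, 5, 3]))   # perfectly linear, decreasing: about -1
print(pearson_r([1, 2, 3, 4], [2, 9, 4, 7]))   # noisy data: strictly between -1 and 1
```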

Finally, recall that our originally calculated equation for the line of best fit for data, based on the z-scores of our data, was ẑ_y = r * z_x (the proof for this is in the textbook on page 180!). If the relationship between


In other words, for each point of a scatterplot, find the z-score for the x-coordinate and the y-coordinate of that point and multiply those together. Do this for all of the points in your scatterplot, add them together, and divide by *n*-1 to get your correlation coefficient.

We discussed various properties of this quantity, and my student asked me the question that teachers always hope for (if sometimes with a bit of dread!): “Why?” Why does this formula produce a quantity that measures the strength of a linear association? And why must the value of *r* necessarily be bound between -1 and 1? In this post, I seek to start an answer to these questions.

The correlation coefficient that we use in AP Statistics is actually something called the PCC or Pearson Correlation Coefficient (or, if you **really** want to impress your friends, the “Pearson Product-Moment Correlation Coefficient” or PPMCC). The formula for this value can be expressed in a number of different ways, and the version above is probably the most succinct. But most conventionally, the formula for the PCC of a sample of data is given as:

*r* = cov(X, Y) / (s_x * s_y)    (Formula 2)

where the values in the denominator, s_x and s_y, represent the standard deviations of the variables on the x- and y-axes, and cov(X,Y) represents the **covariance** of the two variables. The covariance of a sample is a measure of the joint variability of the two variables in that sample, and is found with the formula:

cov(X, Y) = Σ(x – x̄)(y – ȳ) / (n – 1)

In general, if you find that the values of your explanatory (*x-*axis) and response (*y*-axis) variables both tend to be above average and below average for the same cases, your covariance will wind up being positive. If your explanatory is above average when your response is below average and vice-versa, the covariance will be negative.

For example, on days when temperatures in Ithaca are above the yearly average, temperatures in Cape Town, South Africa are below the yearly average, since summer in Ithaca corresponds to winter in Cape Town. Take a sample of temperatures over the whole year, and that sample will have a negative covariance. Similarly, the covariance of a sample of temperature data taken from Ithaca and Zurich, Switzerland will be positive, as those two cities see summer and winter in the same months. The **magnitude** of the covariance might be more extreme for one pair or the other — perhaps temperatures in Zurich vary more widely than temperatures in Cape Town — but the **sign** of the covariance should be clear.

The unit of the covariance is the product of the units that the explanatory and response variables are measured in, and its magnitude is completely dependent on how much variability occurs in those variables. Imagine you’re trying to measure the strengths of associations between fuel efficiency for cars and various other measurements. Compare fuel efficiency to the weight of the car, and you’ll get “Pound-MPGs.” Compare fuel efficiency to tire pressure, and you’ll get “PSI-MPGs.” Since the weights of cars are larger numbers, measured in thousands of pounds, and the tire pressures of cars tend to be numbers between 20 and 40 PSI, the magnitudes of the covariances for these two associations will be wildly different from each other. So how do we assess which comparison produces a stronger association?

This is what the **correlation coefficient** does. By dividing the covariance by the product of the standard deviations, as we saw in *Formula 2* above, we are, for one, dividing by a quantity also measured in the product of units of the explanatory and response variables, thus giving us a unitless (“dimensionless”) quantity. But more importantly, this division normalizes the covariance, accounting for the magnitude of the variability in each individual variable in tandem with the amount of joint variability between both. Thus, we have a value that is more easily compared across different associations, and one that we can more easily evaluate on its own.
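A short sketch (with made-up numbers, plain Python) shows the normalization at work: rescaling a variable, say measuring car weight in pounds instead of thousands of pounds, multiplies the covariance by that same factor but leaves the correlation untouched.

```python
import math

def cov(xs, ys):
    """Sample covariance; cov(xs, xs) is the sample variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def corr(xs, ys):
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

weights_klb = [2.4, 3.1, 3.6, 4.2]  # car weights in thousands of pounds (hypothetical)
mpg = [34, 29, 25, 21]              # fuel efficiency for the same cars (hypothetical)

weights_lb = [w * 1000 for w in weights_klb]  # same data, different units

print(cov(weights_klb, mpg), cov(weights_lb, mpg))    # second is 1000x the first
print(corr(weights_klb, mpg), corr(weights_lb, mpg))  # identical: the units divide out
```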

But why is the correlation coefficient bound between -1 and 1? And why does a value of *r* near 0 indicate a weak association, while a value near one of the extremes indicates a strong one? Please read on!

Statisticians find meaning from data. If you can do this well, you can be paid handsomely for it. Those of you who are considering careers should give the field some considerable thought!

The thing of it is, he makes some really good points in this post. A lot of the things we learn in high school took mathematicians **centuries** to come to terms with. The concept of a complex or imaginary number, *i* = sqrt(-1), wasn’t really accepted until the 18th century (why do you think they’re called “imaginary numbers,” after all?). So don’t despair if it takes you a little while to understand something in math class. Mathematicians of old **died** before they could come to terms with it!

Every so often, the news media becomes all abuzz when a particular lottery jackpot starts to grow really large. Right now is one of those times, with no winner on Saturday putting the jackpot for Wednesday’s drawing at around $1.3 Billion, the largest lottery jackpot in US History.

My students sometimes ask me, as a math teacher and a guy who “knows numbers,” whether I play the lottery. Usually I just smile and tell them I buy the occasional scratch ticket for the fun of it, but almost never anything beyond that. It would require a “special occasion” or a “huge jackpot” for me to consider buying one.

This certainly seems like one of those special occasions.

To understand how to approach this question from a math standpoint, we first need to understand the probability of winning.

The Powerball is a multi-state lottery game run in 44 states as well as Washington, DC; Puerto Rico; and the US Virgin Islands. Only Alabama, Alaska, Hawaii, Mississippi, Nevada, and Utah don’t run the lottery in their state.

Purchasing a $2 ticket involves choosing six numbers: five from the set 1–69 and one from the set 1–26 (called the Powerball). Only if all six of your numbers come up will you win the full Jackpot prize, though there are lesser prizes for matching smaller quantities of numbers. Helpfully, Powerball has a list of prizes and odds on their website, but let’s take a moment to understand how they are calculated.

On the high end, the jackpot requires your five numbers, plus the Powerball, to match those selected in the drawing. Assuming that the drawing is done completely at random, every possible combination of these 5+1 numbers has the same chance of being chosen. So how many combinations are there?

For the first number you pick, there are 69 choices. You can’t pick the same number again, so there are 68 choices for your second number, then 67 choices for the third, 66 choices for the fourth, and 65 choices for the last. This would suggest that there are 69*68*67*66*65 = 1,348,621,560 ways to pick your first five numbers, and that gets multiplied by the 26 choices for the Powerball to give an overall number of possibilities of 35,064,160,560.

But that number isn’t correct. The calculation above is known as a **permutation**, which is a way of counting outcomes assuming that different orders of the same numbers are considered different. Permutations are useful when analyzing how many batting lineups of nine players from a team of 37 are possible, or when asking how many ways you can arrange your five family photographs on a shelf. But that’s not what we need here, because the order that the numbers are selected in is irrelevant. What we need to use is a **combination**, which counts outcomes in a similar way, but treats every different arrangement of the same five numbers as the same outcome.

To get a combination, we divide the 1,348,621,560 figure above by 120, the number of ways to arrange five numbers. This gives a total number of outcomes for the first five numbers of 11,238,513, and a total number of possible tickets, including the Powerball, of 292,201,338. So the probability that you’ll win the jackpot is 1 in 292,201,338, which you’ll notice is the exact probability listed on their website. This probability is ridiculously small. Play 50 tickets a day, every day, and this probability suggests you will still only win once every 16,000 years (and it’s actually worse than that, since Powerball drawings only happen twice a week).
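All of that counting can be verified in a few lines of Python with the standard library’s `math.perm` and `math.comb`:

```python
from math import comb, perm

assert perm(69, 5) == 1_348_621_560        # ordered draws: 69 * 68 * 67 * 66 * 65
assert perm(69, 5) // 120 == comb(69, 5)   # 120 = 5! orderings of the same five numbers
assert comb(69, 5) == 11_238_513           # unordered ways to pick the five white balls
assert comb(69, 5) * 26 == 292_201_338     # times 26 Powerball choices: all possible tickets

print("jackpot odds: 1 in", comb(69, 5) * 26)
```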

Incidentally, this also means that buying all possible ticket combinations – and therefore guaranteeing that you have the winner – would cost $584,402,676. In 1992, an Australian investment firm attempted to do this in the Virginia state lottery, buying 5 million of the possible 7 million combinations. This would still be a bad play, however. You might need to split the pot with another winner or winners, and lottery winnings are federally taxed by up to 25%.

But the Jackpot isn’t the only prize. Look at the bottom end. You win $4 if you only match the red Powerball, which as we said you have a 1 in 26 chance of doing. So why are the odds listed as 1 in 38.32? As the FAQ says, that figure is not **just** the probability of matching the red Powerball, but the probability of matching the red Powerball **and none of the other numbers**. Match at least one other number and your payout is different. Completely missing all five other numbers is a 7,624,512/11,238,513, or about 68%, chance. Multiply that probability by the 1/26 chance of matching the Powerball, and you’ll get the 1 in 38.32 probability they have listed there.
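That arithmetic checks out in a couple of lines of Python:

```python
from math import comb

p_powerball = 1 / 26                       # match the red ball
p_miss_whites = comb(64, 5) / comb(69, 5)  # all five picks among the 64 balls not drawn
p_lowest_prize = p_powerball * p_miss_whites  # red ball right, everything else wrong

print(round(p_miss_whites, 2))       # about 0.68
print(round(1 / p_lowest_prize, 2))  # 38.32, matching the published odds
```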

Now that we have a sense for the probability of winning, what do we do with that information? How can we break this down and understand the statistical payoff for playing this game?

Consider again buying every possible ticket. One of them is guaranteed to be the jackpot, which right now is projected to be $1.3 Billion. Another 25 of them will have all five numbers match, but not the Powerball, winning the $1,000,000 second place prize. Increasingly more will win the lower prizes, with 7,624,512 having the correct Powerball but none of the other numbers matching, winning the lowest prize (and, for what it’s worth, more than 280 million worthless scraps of paper).

If we took the total amount of money won across all winning tickets, subtracted the $2 cost for all 292 million tickets purchased, and divided by the number of tickets purchased, we’d get an average payout per ticket bought. This average is called the expected value and is a reasonably good measure of the value of playing a game (ignoring the business about sharing jackpots and taxes and all of that). We can more conveniently calculate the expected value by multiplying each prize by its respective probability and adding those products together. The table below shows just that, also adjusting each prize for the $2 cost for the ticket.
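Here is a sketch of that calculation in Python. The per-tier prize amounts below are taken from Powerball’s published prize chart at the time (treat the tier-to-payout mapping as an assumption if the rules change); the jackpot is the advertised $1.3 billion:

```python
from math import comb

TOTAL_TICKETS = comb(69, 5) * 26  # 292,201,338
jackpot = 1_300_000_000

# (white balls matched, red Powerball matched?) -> prize in dollars
prizes = {
    (5, True):  jackpot,
    (5, False): 1_000_000,
    (4, True):  50_000,
    (4, False): 100,
    (3, True):  100,
    (3, False): 7,
    (2, True):  7,
    (1, True):  4,
    (0, True):  4,
}

ev = -2.0  # start with the cost of the ticket
for (w, pb), prize in prizes.items():
    # ways to match w of your 5 whites: choose which w match, and which
    # of the 64 undrawn numbers fill your other picks; 1 red match vs 25 misses
    ways = comb(5, w) * comb(64, 5 - w) * (1 if pb else 25)
    ev += prize * ways / TOTAL_TICKETS

print(round(ev, 2))  # about 2.77
```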

This looks great! The expected value per ticket is $2.77, suggesting a positive outcome. But remember, we’re ignoring a lot here: taxes and the possibility of splitting the pot. One thing not yet mentioned: if you want all $1.3 billion, you will have to wait over a period of 30 years, as the jackpot is only paid out over the long term. If you want the whole amount as cash up front, the actual payout is $806 million. How does that affect the expected value?

The expected value is still positive, but now less than half of what it was before. But this still isn’t taking taxes into consideration. USA Mega Jackpot Analysis breaks down tax rules for lottery winnings state-by-state, and New York has the highest state tax rate of any state, at 8.82%. It projects the after-tax total take-home amount of the lump-sum payout to be $533,410,800. All the other lesser prizes will be similarly affected. How does that affect our expected value?

As you can see, even after taking the lesser lump-sum value of the prize instead of the higher long-term payout, and after factoring in taxes, the expected value is still positive (though just barely).

In short, yes. For perhaps the first time in the history of the lottery, a single winner would statistically expect a positive return on their “investment.”

Then again, if you have to split the pot, the top jackpot value will get cut into equally sized pieces, which will send the expected value into the negatives (-$0.85 if split two ways, -$1.15 if split three ways, -$1.30 if split four ways). On the other hand, the value of the jackpot does depend on how many tickets have been bought, and with all the media coverage this record-breaking jackpot is getting, it wouldn’t be surprising to see it push closer to $1.4 billion by Wednesday.

I’ll see you in line for a ticket.

The gist is this: Say you need to have a major operation done and there are two hospitals in your town where you could have it. You’re worried about post-surgery complications, so you do some research into the hospitals and find that in the past year, patients at the larger hospital suffered post-surgery complications in 130 out of 1000 cases, and patients at the smaller hospital suffered complications in only 30 out of 300. Based on these results, it looks like the smaller hospital is the better bet: only 10% of patients had complications after surgery there versus 13% at the larger hospital.

However, not all surgeries have the same rate of complications. Relatively minor surgeries are less invasive and would probably result in a lower complication rate. With that in mind, you look further at the data and find that, at the large hospital, 120 out of the 800 major surgery patients experienced complications compared to 10 out of 200 minor surgery patients, and at the small hospital, 10 of the 50 major surgery patients suffered complications compared to 20 out of 250 minor surgery patients. In other words, broken down by type of surgery, the complication rates at the large hospital were 15%/5% for major/minor surgeries while the small hospital saw rates of 20%/8%. We see now that the larger hospital has a lower rate of complication across the board, regardless of the type of procedure done.

So why the different conclusion? It has to do with **how many** of both types of procedures the hospitals did. The vast majority of the larger hospital’s 1000 surgeries in the last year were major surgeries, which have higher complication rates across the board. The majority of the smaller hospital’s 300 surgeries were more minor procedures, which generally have lower rates of complication. As a result of this imbalance, the overall, pooled complication rates for the two hospitals are biased: the larger hospital towards a higher rate and the smaller hospital towards a lower rate. So it only **appears** that the smaller hospital has a lower complication rate because most of the surgeries performed there are less likely to have complications.
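The whole paradox fits in a few lines of Python, using the counts from the example above:

```python
# (complications, total surgeries) per procedure type at each hospital
large = {"major": (120, 800), "minor": (10, 200)}
small = {"major": (10, 50),  "minor": (20, 250)}

def rate(complications, total):
    return complications / total

# Broken down by procedure, the LARGE hospital is better at both kinds
for kind in ("major", "minor"):
    assert rate(*large[kind]) < rate(*small[kind])

# Pooled together, the SMALL hospital *looks* better -- Simpson's Paradox
print(rate(130, 1000))  # large hospital overall: 0.13
print(rate(30, 300))    # small hospital overall: 0.10
```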

Check out this website for another explanation of Simpson’s Paradox, as well as some clever interactive animations that demonstrate how and why it can arise. It’s an important lesson as consumers of data and statistics: while the saying may go “Less is More,” when it comes to how much detail to include in your research, sometimes less is wrong.

Update: It appears that the above VUDLab link is dead, which is too bad. Instead, you could check out this Towards Data Science article or this MinutePhysics YouTube video for some more information.

- Sweet Number Pi – Pi music video
- One Million Digits of Pi – can you memorize them all?
- Official Guinness World Record for Most Memorized Digits of Pi – the record is 67,890 places!
- Search for Your Birthday in Pi – mine starts at the 2,373,070th decimal place!
- The Tau Manifesto – the circle constant is arguably misdefined and should be twice the value it is. Many mathematicians call this number “tau,” and there is a convincing argument to be made for their point!
- Other Pi Day websites – PiDay.org, PiZone.com