24 September 2012

*"You'll love this expensive neighborhood. It has great schools" - Every Realtor, ever*

Maybe.

This is the second post in a series that uses data to analyze school quality. In the last post we identified a key success metric and used it to measure the quality of schools. Now we'll ask: Are good schools always in expensive neighborhoods?

**WARNING:** Correlation != causation. We'll see correlations between housing prices and school quality. This does not mean school quality changes *because* of the price of a house. Children don't become better students because their bedroom has fancier paint.

When parents buy a house, a 'good neighborhood' means 'near good schools'. Parents will spend as much as they can afford to be near a good school. Let's see if it's a good idea to spend more; below is a graph of schools' High Achiever % compared with house prices.

There is a rough trend where school quality increases as house prices increase. Let's see how strong that trend is. A little linear math (least-squares regression) produces this equation:

```
High Achiever % = Median House Price * 0.0000002439 - 0.0274391
R^2 = 0.259, SSE = 0.818, MSE = 0.00223, p-value < 0.0001
```
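For readers who want to see what the "linear math" involves, here is a minimal sketch of a least-squares fit in plain Python. The function name `fit_line` and the five data points are hypothetical and are not the data set behind this post, so the numbers it prints will not match the equation above.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, sse, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Slope is the covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # SSE: squared error left over after the fit.
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # SST: total squared variation of y around its mean.
    sst = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - sse / sst
    return slope, intercept, sse, r_squared


# Hypothetical example data: median house price and High Achiever fraction.
prices = [150_000, 220_000, 310_000, 400_000, 520_000]
high_achiever = [0.02, 0.05, 0.04, 0.09, 0.08]

slope, intercept, sse, r_squared = fit_line(prices, high_achiever)
print(f"High Achiever % = Median House Price * {slope:.10f} + {intercept:.7f}")
print(f"R^2 = {r_squared:.3f}, SSE = {sse:.5f}")
```

A statistics package would also report the p-value; this sketch leaves it out to stay dependency-free.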

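R² measures how much of the variation in one variable is explained by the other. An R² value of 1 means a perfect linear fit. An R² value of 0 means no linear relationship. Here R² is 0.259, so house prices account for roughly a quarter of the variation in High Achiever % (a correlation coefficient of about 0.51). The p-value is far below the usual 0.05 threshold, meaning that the trend is statistically significant and very unlikely to be a product of chance. So, we're not grasping at straws.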

We're interested in schools whose High Achiever % is higher than house prices alone would predict. The gap between the actual value and the predicted value is known as the *residual*.

```
High Achiever % Not Due To House Prices =
High Achiever % - High Achiever % Due to House Prices (equation)
```

This is the same data as above, except without the influence of house prices. Here we can see which schools are good deals; they have positive values.

For example, Mercer Island High School has a 16% High Achiever value, which is quite good. However, it is a *very* low score for the cost, because the median house costs over $1 million and the equation predicts roughly 22% at that price. Therefore its residual value is negative (it's not worth the price). In contrast, Friday Harbor High School has a 27% High Achiever value, with a median house price of around $360K, where the equation predicts only about 6%. It's a far better deal.
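The arithmetic behind those two examples can be checked with a short sketch that plugs the numbers into the house-price equation above. The function names are made up for illustration, and the prices are assumptions rounded to exactly $1,000,000 and $360,000, since the text only gives approximate figures.

```python
# Coefficients copied from the house-price regression equation above.
SLOPE = 0.0000002439
INTERCEPT = -0.0274391


def predicted_high_achiever(median_house_price):
    """High Achiever fraction the regression predicts for a given house price."""
    return median_house_price * SLOPE + INTERCEPT


def residual(actual_high_achiever, median_house_price):
    """Actual value minus predicted value; positive means a good deal."""
    return actual_high_achiever - predicted_high_achiever(median_house_price)


# Assumed prices: "over $1 million" and "around $360K" rounded for illustration.
mercer_island = residual(0.16, 1_000_000)
friday_harbor = residual(0.27, 360_000)

print(f"Mercer Island residual: {mercer_island:+.3f}")  # -0.056
print(f"Friday Harbor residual: {friday_harbor:+.3f}")  # +0.210
```

Since Mercer Island's median price is above $1 million, its true residual is even more negative than this estimate.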

Parents could use this data to make informed decisions about where to move. For example, we can see that houses over $400K don't provide much (if any) additional improvement in school quality.

What if we want to rent, instead of buying? Let's look at school quality compared to rent price:

There is a correlation between rental prices and school quality. More linear math, and we find:

```
High Achiever % = Average Rent * 0.0001197 - 0.0684526
R^2 = 0.247, SSE = 0.724, MSE = 0.00282, p-value < 0.0001
```

There is still a correlation, but it's slightly weaker than the one for house prices (an R² value of 0.247 instead of 0.259). Still, we can use the linear equation to plot the residual, looking for good deals:

Because rent explains slightly less of the variation in school quality than house prices do, there are more potential deals when renting instead of buying.
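Hunting for rental deals amounts to sorting schools by their residual against the rent equation. The sketch below shows one way to do that. The three schools, their rents, and their High Achiever values are entirely hypothetical placeholders, and the helper name `rent_residual` is made up for illustration. Only the two coefficients come from this post.

```python
# Coefficients copied from the rent regression equation above.
RENT_SLOPE = 0.0001197
RENT_INTERCEPT = -0.0684526


def rent_residual(actual_high_achiever, average_rent):
    """Actual High Achiever fraction minus the value predicted from rent."""
    predicted = average_rent * RENT_SLOPE + RENT_INTERCEPT
    return actual_high_achiever - predicted


# Hypothetical schools: (name, average rent in dollars, High Achiever fraction).
schools = [
    ("School A", 1200, 0.12),
    ("School B", 1800, 0.10),
    ("School C", 900, 0.08),
]

# Best deals first: the largest positive residual beats the trend by the most.
ranked = sorted(
    schools,
    key=lambda school: rent_residual(school[2], school[1]),
    reverse=True,
)

for name, rent, high_achiever in ranked:
    print(f"{name}: rent ${rent}, residual {rent_residual(high_achiever, rent):+.3f}")
```

Note that the highest raw score does not automatically win: a school with a lower High Achiever value can rank above a higher-scoring one if its rent is low enough.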

**The Takeaway**: Don't assume schools are better because they're in a pricier neighborhood; that's not always true. And consider renting instead of buying.

Sometimes we don't have to dig too deeply. This basic level of analysis is sufficient for parents on a budget.

However, there is a *lot* more insight in this data. In the next post we will look at which factors influence school quality. Stay tuned for more data!