More Median Price Musings

Last week Tim had a great post on median prices, where he looked at sales by area of the county, and showed that sales had remained strong in the high end locations while they had dropped away in the lower-priced areas. It was great analysis, and it got the wheels turning in my head – with average price data at the right level, I knew that we could really show how the distribution of sales prices was changing. Of course, the MLS likes to make this data as difficult as possible to come by, so I wasn’t able to dig any deeper.

Then while reading a blog this morning I found this link for MelissaDATA – which is the company that sells your name to all kinds of people when you buy a new house. One of the other things they do to tempt buyers of that data is show the number of sales and average price per sale by zip code. They had data back to 2001.

After a whole lot of point, click, cut and paste activity – I now have a distribution that shows the average price of a home sale and the number of sales by zip code for all of King County by month for the last 6 years. With about 75 different zip codes to report, I was able to put together a histogram that I think should approximate the sales price distribution in the overall market.

This chart pretty clearly shows that the sales volume has dropped year over year in the lower priced zip-codes while the high-end “tail” of the distribution has remained fairly consistent. In comparing 2007 to 2005 and 2006, the shape of the distribution appears flatter and more skewed to the right.
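For readers who want to try this at home, a histogram like this takes very little code. Here is a minimal sketch in Python, assuming the data has already been collected as (zip code, average price, number of sales) rows for one year. The rows, the zip labels, and the $50k bin width are invented for illustration. They are not the MelissaDATA figures, and this is not necessarily how the chart was built.

```python
from collections import defaultdict

# Hypothetical (zip code, average sale price, number of sales) rows for one year.
# All values are made up for illustration only.
rows = [
    ("zip_a", 265_000, 120),
    ("zip_b", 289_000, 95),
    ("zip_c", 342_000, 150),
    ("zip_d", 610_000, 40),
    ("zip_e", 1_250_000, 12),
]

BIN_WIDTH = 50_000  # assumed bin size, not taken from the post


def price_histogram(rows, bin_width=BIN_WIDTH):
    """Sum sales counts into price bins keyed by each bin's lower edge."""
    hist = defaultdict(int)
    for _zip_code, avg_price, sales in rows:
        # Every sale in a zip code is counted at that zip code's average price.
        lower_edge = (avg_price // bin_width) * bin_width
        hist[lower_edge] += sales
    return dict(sorted(hist.items()))


histogram = price_histogram(rows)
for lower_edge, sales in histogram.items():
    print(f"${lower_edge:,} to ${lower_edge + BIN_WIDTH:,}: {sales} sales")
```

Because every sale in a zip code lands in the bin of that zip code's average price, the result only approximates the true distribution of individual sale prices.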

More support for Tim’s contention that the median price doesn’t tell the whole story.


Here’s another look at the data based on JP’s excellent suggestion in the comments. In order to normalize away the shift in the median and just focus on the shift in the volume by neighborhood, I assigned each zip code to a decile based on its 2005 average price and sales volume, such that each decile contained about 10% of the sales volume for that period (not exactly, as you can see, because the sales are grouped by zip code, so they are kind of “lumpy”). I then compared the distribution of sales in 2005 to the distribution in 2007 for these decile assignments. I also reduced the data set to just the last two months, as I wanted to focus on what the impact of the subprime fallout has been.

As you can see in the chart below, the shift isn’t huge, but the top five deciles’ share of total sales has increased. In 2005 the top five deciles accounted for 52.4% of total sales. In 2007, this has increased to 55.6% of total sales. In terms of volume, sales in the top five deciles have dropped 55% (from ~6,400 to ~2,900 homes) while sales in the bottom five have dropped 60% (from ~5,900 to ~2,300 homes).

As JP points out, this does assume that the “mix” of homes by zip code remains constant over time (e.g. Medina remains relatively high while White Center remains relatively low) but given there is only a two year span between the data sets, I think this should be a safe assumption.
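The decile approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual spreadsheet logic: the zip labels, prices, and sales counts are hypothetical, and the rule of placing each zip code by the midpoint of its slice of cumulative volume is just one reasonable way to handle the “lumpiness”.

```python
def assign_deciles(base_year):
    """Map zip code -> decile (1 = cheapest) using base-year price and volume.

    base_year is a list of (zip_code, avg_price, sales) tuples. Zip codes are
    ranked by average price, and each one is placed by the midpoint of its
    slice of cumulative sales volume (an assumed tie-breaking rule).
    """
    total = sum(sales for _, _, sales in base_year)
    deciles = {}
    cumulative = 0
    for zip_code, _price, sales in sorted(base_year, key=lambda r: r[1]):
        # Integer form of (cumulative + sales / 2) / total * 10
        index = (2 * cumulative + sales) * 10 // (2 * total)
        deciles[zip_code] = min(index + 1, 10)
        cumulative += sales
    return deciles


def top_half_share(sales_by_zip, deciles):
    """Fraction of sales that fall in deciles 6 through 10."""
    total = sum(sales_by_zip.values())
    top = sum(n for z, n in sales_by_zip.items() if deciles[z] >= 6)
    return top / total


# Hypothetical data: ten zip codes with equal 2005 volume and rising prices.
base_2005 = [(f"zip_{i}", 200_000 + 50_000 * i, 100) for i in range(10)]
deciles = assign_deciles(base_2005)

sales_2005 = {z: n for z, _, n in base_2005}
# Made-up 2007 volumes: cheaper zip codes fall further than expensive ones.
sales_2007 = {f"zip_{i}": (40 if i < 5 else 50) for i in range(10)}

share_2005 = top_half_share(sales_2005, deciles)
share_2007 = top_half_share(sales_2007, deciles)
print(f"Top five deciles: {share_2005:.1%} in 2005, {share_2007:.1%} in 2007")
```

Since the decile labels are frozen at their 2005 values, appreciation between the two years cannot move a zip code into a different group. Only a change in where the sales happen can move the share.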

Thanks for the comments and suggestions. Keep them coming!



  1. 1
    MisterBubble says:

    This chart pretty clearly shows that the sales volume has dropped year over year in the lower priced zip-codes while the high-end “tail” of the distribution has remained fairly consistent.

    I’m not so sure about that….from this chart, there’s no way to tell if the change in price distribution is a function of increasing home prices (overall), or declining sales in cheaper neighborhoods.

    It does appear that there have been fewer sales between $200k and $500k in 2007 than in 2005-2006, but then again, 2007 is only half-way finished. 2005 and 2006 are comparable in this range, with a notable $50k shift in almost all prices.

    Since you have this data for many zip codes, it might help to make a series of similar plots, where the lines represent different neighborhoods.

  2. 2
    biliruben says:

    Nice work. Maybe adjust by average annual appreciation, to make it easier to compare the distribution across years.

  3. 3
    MrFish says:

    We’ve been looking to upgrade from a Ballard condo to a home for the last year, and have been focusing on Renton Highlands. Last spring, we couldn’t find anything below $525k that we would be willing to live in. In the last month, we’ve found many places that meet our criteria for around $425k.

    What’s more, immediately after the latest lending rules went into effect, there has been a sudden change in the number of houses on our Redfin favorites list that have sold. In fact, only a handful have gone to “Contingent”. The email updates from Redfin mostly consist only of new listings and $20k price drops.

    I know it’s only anecdotal, but I think the deflation of the bubble has finally begun in earnest.

  4. 4
    MM says:

    I agree with MisterBubble. This data does not support the mix shift — you can get the same chart if the price for every zip increases at the same speed.

    Can you share the Excel so we can see the price change in each zip code? If we can prove that the price does not move within each zip code it will be more convincing.

  5. 5
    IAmCornholio says:

    This blurb from the WSJ (sorry, subscription required):

    An auction of about 135 foreclosed homes in San Diego Saturday provided more sobering news for mortgage lenders. Ramsey Su, an investor and former real-estate broker who attended, calculated that the high bids for the homes averaged 67% of the prices they fetched when they were last sold, mostly in 2004 or 2005.

  6. 6
    rose-colored-coolaid says:

    I think this is very informative. Remember, there are mean, median, and mode averages. This graph gives an excellent view of the mode. Notice that in all three years, the mode is in the lower end. This tells you cheaper houses are purchased more than expensive houses.

    In 2005, the mode is around $275k. In 2006, the mode is about $350k. But in 2007, the mode actually falls to about $325k. Also, the number of sales at that most popular point drops almost in half from 2006 to 2007. It looks to me like very strong evidence that purchases of ‘starter homes’ are in massive decline, and it even causes me to question if starter home valuations are already in decline.

    So what say others? Is mode an interesting average to investigate? It would be intriguing to compare historical modal price against Case-Shiller and see if those values correlate strongly. I suspect they will correlate more consistently than Case-Shiller and median do.

  7. 7
    Tom says:

    deejayoh, that is a great chart (although I’d use straight lines between data points instead of the smoothed lines, which make it appear as if there’s more data than there really is). Thanks for the hard work. It very clearly shows (MisterBubble, you can eyeball the average from the shape of the curve) lower sales in the low end and prices creeping up at the high end.

    What would be most interesting is to see a similar chart for a city that’s already in a bust (San Diego, Boston, whatever) from both before and after prices started going down.

  8. 8
    MisterBubble says:

    MisterBubble, you can eyeball the average from the shape of the curve

    Yes, you can, but that wasn’t my point.

    My point is this: given these plots, you can’t tell the difference between price increases due to appreciation, and price increases due to a drop-off in low-end sales. All you know is that more homes sold for more money in 2006 than did in 2005.

    This plot does not show that “sales volume has dropped year over year in the lower priced zip-codes”, which is what DJO wrote at the bottom of the post. At least, not as far as I can tell from the labels and the axes….

  9. 9
    deejayoh says:

    Mr B –
    The height of the curves indicates the number of sales in each price range. The area under each curve is the total sales in the first 7 months of each year. So my read is that the height is lower at the left (low) end, while it has pretty much remained the same at the right (high) end. That is the drop I was referring to. Since each data point reflects the average sales price for an individual zip code, the points on the left side of the curve should be by definition the lower-priced zip codes. I’ll dig into the data to find the big drops in volume to confirm.

  10. 10
    JP says:

    I agree with MisterBubble – with the plot, it is not possible to distinguish between appreciation and a change in the balance of low-end/high-end sales. In either case, the “mass” of the plot will shift to the right, but there’s not a way to tell why the mass of sales has shifted.

    I have a suggestion that might give some insight into whether the quality of housing that’s selling has changed over time across the region.

    Using the 2005 average price per zip code, assign each zip code to a bin. Generate the same histograms above, but create the 2006 and 2007 histograms by assigning the 2006/2007 sales to bins according to their 2005 price.

    In this approach we use the 2005 average price to classify a zip code as low/medium/high-end. Then the 2005/2006/2007 histograms may give a sense of how the mix of housing sold may be changing over time. In this case, a shift of the “mass” of the plot would seem to suggest a change in the mix of housing sold and the effect of appreciation on the histogram is negated.

    This approach assumes that the low-end/high-end nature of the housing stock in a zip code is uniform and does not change over time. Both assumptions may be false. ;)

  11. 11
    deejayoh says:

    JP –
    Good suggestion. I’ll use 2005 to assign zip codes to deciles and recalculate the chart. Look for an update soon.

  12. 12
    softwarengineer says:


    Real statistics don’t lie but MSM liars totally twist statistics in Seattle.

  13. 13
    DG says:

    I think an analysis of the median listing price of the inventory would be telling. If the mix is really a strong variable in the median house price, then the median or average price for the available inventory should be dropping (i.e. less of the higher-priced stuff and more lower-priced inventory). While this is not exact and somewhat overstated due to not including final prices, I think the result would yield the answer. I’d also throw out the top and bottom 10 listings to normalize the data. If I were a better number cruncher, I’d do it myself.

  14. 14
    CKT says:


    Great work! Please check your messages in the forum.


  15. 15
    patient says:

    Very nice. It might be easier to see DJ’s point with a pure relative histogram where the curves are aligned at the first “bump”. It would more clearly show the change in distribution.

  16. 16
    tacoma guy says:

    (Was just trolling, and just had to comment after looking at this graph)
    That updated graph makes somebody’s point very clear.
    But your method is too hard for me to figure out so I used a way that required no math skills.
    Instead of using zip codes, I used the new irregular mapping feature in the MLS to draw out a neighborhood of “starter homes” I know very well, along geographic boundaries where all the homes are about the same value. I then restricted the search in that area to 3 bedrooms in the 1200-1800 size and ran the CMA feature to find the median for each month. Once that was set up, I just changed the month: Jan, Feb, April…. and so on.
    I found a small increase in the median price in June followed by a drop in the median starting in July. I wonder how August will turn out, but I should wait until the end of the month because most closings are between the 20th and 30th of each month.
    (And now my theory on the data)
    July was the first month to show the subprime crunch through a lower median selling price. Although April and May were the months when subprime programs were first cut, most lenders were given a last chance to lock their subprime files in May with a 30+ day lock, so lenders slammed all their loans in for a June closing. That accounts for the slight increase in June and then the drop in July. A few stragglers may have closed in July, so August will be the first month to feel the full effect of the missing subprime buyers. Then give sellers about 2 to 3 months to reduce their prices; that should show up in October-November this year. And this is only the “starter home market”. After a lag of about 3-6 months, those in the “higher-end markets” will have to drop in spring of 2008: starter homes need to be sold so others can move to the head of the house class, or….. pyramid.
    But I hope none of this happens, somebody prove me wrong!!! Please

  17. 17
    JP says:

    Deejayoh –

    I like the way you broke down and presented the data using deciles. I think it makes a more clear and precise presentation than a histogram.

    This feels to me like solid evidence that the distribution of sale prices is changing.

  18. 18
    Jeff says:

    I checked data for zip code 98021 (Bothell)

    The database shows 36 sales for the ENTIRE YEAR of 2005, then 246 sales for just the month of January of 2006.

    Obviously, the data is incomplete or incorrect.

    Garbage in garbage out.

  19. 19
    deejayoh says:

    a) 98021 is not in King County. Check the King County zip code map: it doesn’t exist there. Other sites list it as Snoho
    b) I picked up 75 zip codes. They are all in King County. I didn’t see any with that type of problem. A few weird spots, but nothing that seemed wildly inconsistent with what I know of the NWMLS data. I’m more than happy to share the whole spreadsheet if you want to check the math.
    c) Your comment is not terribly constructive. Do you disagree with the conclusions, or were your fingers just itchy?

  20. 20
    Jeff says:

    It’s based on a database that is incomplete.

    If you find a note on their website that states that the data includes EVERY sale in King County then you have something. I doubt they make any guarantee that it is complete. It could very well be that it was easy for them to collect data of a particular price range (hypothetical case). Their customers are just looking for a long list of addresses, not “completeness”.

    Just because you don’t see a similar problem in the zip codes that you used doesn’t mean they are complete or even accurate.

    In science, statistical analysis is subject to peer review. When a problem is found a conscientious author interested in the truth would find it “constructive”.

    It seems likely that your charts qualitatively reflect reality. But with suspect data you can’t be certain. Nothing personal.


  21. 21
    deejayoh says:

    Jeff –
    I agree the data is not comprehensive. You’ll note that I said I think it approximates the distribution. I went back and checked the year-end MLS recaps, and it seems to be a subset of about 40% of the closed sales. However, it is reasonably consistent with the MLS data in terms of price trends, seasonality, and growth over time.

    So, if you’re willing to treat it as kind of a random sample off which to base some analysis, then you can play around and maybe gain some interesting insights.

    If you’re not – then you’re left to believe what you read in the paper, or glean what you can from what the NWMLS is willing to share (which ain’t much!).

    Either way, at least you didn’t have to pay for it ;^)

  22. 22
    patient says:

    It seems like the crown jewel of the Seattle real estate industry is the Eastside. The comments about Seattle being special are multiplied when it comes to the Eastside. Isn’t this an interesting target for a possible myth debunking by SBB? The commonly accepted view is that Eastside homes are “move-up” homes that are purchased with boatloads of cash and thereby will hardly be impacted by the current turmoil in the mortgage market. Driving around the Eastside, however, I do not get the impression that the median home would be a “move-up” home for the affluent. Those homes are mostly west of I-405, but the main part of the Eastside is east of I-405, where the majority of houses look more like first-time-buyer homes to me. It would surely be interesting to know what the median home on the Eastside looks like: how big, how old, how many garages, etc., and to see whether it is a “move-up” McMansion or not.

  23. 23
    softwarengineer says:


    Just a quick comment from a Seattleite that’s been here since birth and seen it all. Bellevue, the jewel of low income [yes, Bellevue was originally built in the 50s for low incomes] flats, is a bit better than Seattle’s tiny $600K variety….but when you say “moving on up homes” do you mean $400K apartments/condos? LOL

  24. 24
    patient says:

    softwarengineer, I was thinking of SFHs. Mainly because the Eastside seems to be more of a family-oriented SFH area and also since I would think SFHs are the most common starter homes for the suburbs.

  25. 25
    Garth says:

    Countrywide may be bought out, but they are going to remain in business.

    Countrywide took 1 billion in subprime loans off the market as they would not accept 80% book value for the loans.

    It seems like the subprime CDOs that have been sold went for about 90% of their book value. Reductions of 10-20% in the book value of the highest-risk mortgage assets do not lead me to believe we are in for a pronounced crash.

    The question I have is when the computerized risk models used extensively by hedge funds are going to be updated and trusted again. Without those models, only very traditional insurable mortgages can be resold easily.

  26. 26
    TJ_98370 says:

    ….The question I have is when the computerized risk models used extensively by hedge funds are going to be updated and trusted again….

    Another good question – When will anybody buy the equity tranche of a CDO again?

  27. 27
    deejayoh says:

    I just saw this article from last week’s SF Chronicle, where an analyst from DataQuick makes the same assertion about median prices being driven by the mix (and he has the raw data!)

    Home prices rise in July even as sales fall to 12-year low

    A total of 4,990 existing single-family homes changed hands in the nine-county Bay Area in July, according to DataQuick Information Systems of La Jolla (San Diego County). That was down 13 percent from 5,721 home sales in July 2006.

    The median price was $738,500, up 7 percent from $688,955 a year ago.

    The median rose because a greater proportion of expensive homes were sold, said Andrew LePage, an analyst at DataQuick. Tighter lending standards after the subprime loan debacle have knocked many entry-level buyers out of the market because they lack down payments, good credit or solid income proof. “If you yank out a bunch of low-cost sales, then guess what happens to the median?” he said.
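The point in that last quote is easy to check with a toy example. The sketch below uses a made-up list of ten sale prices (in thousands of dollars, not real data) and shows what happens to the median when the three cheapest sales simply don't occur, with no individual price changing at all.

```python
from statistics import median

# Hypothetical sale prices in $ thousands; invented for illustration only.
all_sales = [250, 275, 300, 325, 350, 400, 450, 600, 750, 900]

# Same market and same prices, but the three cheapest sales never happen.
fewer_low_end = all_sales[3:]

median_before = median(all_sales)
median_after = median(fewer_low_end)

print(f"Median with all sales: ${median_before:,.0f}k")
print(f"Median without the low end: ${median_after:,.0f}k")
```

In this toy case the median goes from $375k to $450k even though no home is worth a dollar more, which is the mix effect discussed in the post.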
