Data Collection Done Right (or How Google Takeout Wasted My Time)
Good data analysis starts with good data collection. I needed reliable, consistent data on my furnace’s actual performance. I assumed this would be simple. It was not.
Logging Furnace Runtime Manually
The first thing I did was track my furnace’s runtime day by day using my Nest thermostat. Nest provides historical runtime data, but not in a convenient way—so I reviewed it manually and recorded the number of hours my furnace ran each day in Excel. It wasn’t glamorous, but it worked.
Next, I needed temperature data. I turned to Weather Underground, pulling the daily average temperature and cloud cover conditions for my location. This also went into my Excel sheet, creating a running dataset that would help me correlate temperature changes with my furnace’s energy use.
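To make the structure concrete, here’s a rough sketch of how a daily log like this can be lined up for analysis. The file and column names below are placeholders for illustration, not my actual spreadsheet layout.

```python
import pandas as pd

# Placeholder file and column names: one row per day, furnace runtime in
# hours, plus the day's average outdoor temperature and cloud cover pulled
# from Weather Underground.
runtime = pd.read_excel("furnace_log.xlsx", parse_dates=["date"])
weather = pd.read_excel("weather_log.xlsx", parse_dates=["date"])

# Line the two logs up by date so each day's runtime sits next to that
# day's weather conditions.
daily = runtime.merge(weather, on="date", how="inner")
print(daily[["date", "runtime_hours", "avg_temp_f", "cloud_cover"]].head())
```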
Since I ultimately wanted to compare my gas furnace to a hypothetical heat pump, I converted my gas usage from therms into kilowatt-hours (kWh). This way, I’d have an apples-to-apples comparison between my furnace’s energy demands and what an electric system would require.
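The conversion itself is straightforward: one therm is 100,000 BTU, or roughly 29.3 kWh. A quick sketch (the 4.2 therms below is just an example figure, not a number from my data):

```python
THERM_TO_KWH = 29.3  # 1 therm = 100,000 BTU ≈ 29.3 kWh

def therms_to_kwh(therms: float) -> float:
    """Convert gas usage measured in therms to kilowatt-hours."""
    return therms * THERM_TO_KWH

# Example: a cold day that burned 4.2 therms of gas
print(f"{therms_to_kwh(4.2):.1f} kWh")  # about 123 kWh of energy
```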
This process took time. A lot of time. I did this day by day for years to ensure I had enough data to find meaningful patterns.
The Google Takeout Debacle
At some point, I realized that Nest is owned by Google, so surely Google Takeout (Google’s data export tool) would let me grab my thermostat’s historical data in a structured, machine-readable format. I figured this would let me analyze more nuanced trends, such as runtime per call for heat, without spending hours manually logging it myself.
Nope.
The Takeout data was a mess. I expected a neat dataset; what I got was shockingly incomplete and riddled with duplicates. Data points repeated, timestamps were unreliable, and entire sections of runtime history were simply missing. What should have been a shortcut turned into an hours-long exercise in frustration.
I tried sifting through it, hoping I could clean it up, but the more I dug, the clearer it became: this wasn’t just messy data—it was junk data. There was no clear logic to how the duplicates appeared, and no guarantee that any given entry was accurate. After a lot of wasted effort, I had to scrap the Takeout data entirely and rely on my own manually collected observations.
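To give a sense of what cleaning an export like this involves, here’s a rough sketch in Python, written against a placeholder file layout rather than Google’s actual export format (the opaque, inconsistent format being part of the problem):

```python
import pandas as pd

# Placeholder layout: the real Takeout export is not a tidy CSV like this,
# which is exactly why the cleanup failed.
events = pd.read_csv("takeout_thermostat_events.csv", parse_dates=["timestamp"])

# Remove exact duplicate rows, then sort so gaps are easy to spot.
events = events.drop_duplicates().sort_values("timestamp")

# Any stretch longer than a day with no events at all is more likely missing
# history than an idle furnace in the middle of winter.
gaps = events["timestamp"].diff()
print(gaps[gaps > pd.Timedelta(days=1)])
```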
Lesson Learned: Trust but Verify
I had assumed that a major tech company would provide clean, reliable data through its own export tool. I was wrong. Instead, I learned an important lesson: sometimes, the slow and steady method—manually logging data over time—is more reliable than trying to grab a bulk export.
In the end, my analysis was built on solid, verified data, even if it took longer to collect. And that data gave me real insights into how my furnace actually performed, insights that would have been skewed had I leaned on the flawed automated export.
The next step? Using this data to see if an air-source heat pump could handle the job.