Every so often, I get to meet someone who's not only doing cool IT, but doing it for a really cool company. The cool IT person in this case is Paul English of 3TIER.
What makes 3TIER so interesting? They're doing the modern equivalent of oil exploration -- providing the supporting data and analytics for a growing number of clean energy projects.
Yes, there's big data involved. But there are also some amazing insights into a fascinating part of the IT landscape that should matter at least in small part to each and every one of us.
3TIER's Business Model
Or, if you've got a location in mind, what are the long-term prospects for energy generation?
The good news? Much of the data is free, or -- at least -- already paid for. For example, the US federal government (specifically, the National Weather Service) provides extremely detailed longer-term forecasts of all sorts of weather, including wind speeds.
The bad news? That's not nearly enough. If you're going to, say, approach a bank for a loan for your wind farm, you're going to need to show them more than some nice long-term weather forecasts from the government.
Now, take that weather forecast, mash it up with historical weather patterns, add in some brilliant PhDs in meteorology, and -- voila! -- you've got a decent independent assessment of the future wind potential of that location.
Enough to get a loan to get that project built.
Much in the same way analyzing oil exploration data built the foundation for carbon-based energy, 3TIER's analytics are building the foundation for many forms of renewable and clean alternative energy.
The "wind" business gave way to hydro (think rainfall and dams) which gave way to the current hot topic: solar. 3TIER's latest fast-growing business is taking satellite data showing solar radiation and helping to justify the latest round of solar farms: both large and more modest.
Forecasting Matters More Than You Might Think
There's an important real-time nature to alternative energy forecasting that initially escaped me, but Paul patiently explained.
Put up a wind farm, for example, and you're selling your energy to the power utility. They expect a relatively precise forecast (even hour-by-hour) as to how much energy you intend to produce, since you're just one source among many.
Over or underproduce your forecast, and there are financial penalties involved -- everything from lower prices for your power, all the way to explicit charges for dumping too much power on the grid. So accuracy here really matters: certainly when the project is being justified, but -- more importantly -- every day that the project is producing power.
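To make the penalty idea concrete, here's a rough sketch of how an hourly forecast miss might turn into a charge. The tolerance band, the per-MWh rates and the names (`HourlyResult`, `imbalance_charge`) are all my own illustrative assumptions; they are not 3TIER's model or any utility's actual tariff.

```python
from dataclasses import dataclass


@dataclass
class HourlyResult:
    hour: int
    forecast_mwh: float  # what the producer told the utility to expect
    actual_mwh: float    # what the farm actually delivered


def imbalance_charge(results, tolerance=0.10, under_rate=20.0, over_rate=10.0):
    """Sum hypothetical charges for hours that miss the forecast.

    tolerance, under_rate and over_rate are made-up illustrative values:
    a miss inside +/- tolerance of the forecast costs nothing, and only
    the energy outside that band is charged at the per-MWh rate.
    """
    total = 0.0
    for r in results:
        deviation = r.actual_mwh - r.forecast_mwh
        band = tolerance * r.forecast_mwh
        if deviation > band:        # overproduced: too much power on the grid
            total += (deviation - band) * over_rate
        elif deviation < -band:     # underproduced: the utility must cover the gap
            total += (-deviation - band) * under_rate
    return total


# Three hypothetical hours: one on target, one over, one under.
day = [
    HourlyResult(0, 10.0, 10.5),
    HourlyResult(1, 10.0, 13.0),
    HourlyResult(2, 10.0, 7.0),
]
print(imbalance_charge(day))
```

Even in this toy version, the charge grows with the size of the miss in every hour, which is why a better forecast pays for itself daily rather than just once at project approval.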
To meet these requirements, 3TIER offers two services: assessments and forecasts. The first service is basically smart people backed by lots of data and powerful models. The second is a real-time data feed with associated analytics (including the customer's own data) that helps accurately forecast how much your wind farm, dam or solar farm will output at any specific time.
More recently, 3TIER began offering a web service targeted at, say, smaller solar projects that can't initially justify an expensive consulting engagement. For $1500, you can get a quick look at how much sun will likely fall on your given location.
And, of course, all the data needs to be online for that one :)
3TIER As A Company
Based in the Pacific Northwest, 3TIER is rather modest -- only 50 employees or so. However, about 30% of them are PhD-level scientists in hydrology, meteorology, solar radiation, etc. As Paul puts it: "it's a joy to be supporting such bright people". In EMC vernacular, these are the data scientists.
The primary "engine" (factory?) is a home-grown compute cluster of about 300 commodity nodes, backed by some 500TB of primary storage. Tape is only used for recovering important data sets; all primary data sets of interest live on disk, Isilon in this case.
Behind the data scientists are 4-5 "data logistics" people who are basically scheduling data transfers in and out of the compute environment. Behind them, Paul is the leader of a hearty troop of 5 people who provide the supporting IT infrastructure and services, in addition to more pedestrian pursuits such as desktop support, phone service, email, etc.
But, if you think about his environment, it's absolutely clear what the primary mission of 3TIER's IT is: supporting the scientists who in turn support their customers. Everything else is largely extraneous. That's a very pure and simple model -- one that I think more than a few people reading this might envy.
Business is good. Although they're privately held and VC-backed, Paul was pretty clear that there was increasing demand for 3TIER's services. The quality of the forecast is the differentiator in this marketplace -- good forecasts beat average ones every time. And, given the strong interest internationally in alternative energy projects (think India and solar for example), 3TIER seems to have plenty of room to run.
Newer areas of interest include tidal & wave energy, as well as geothermal. And, of course, any investment in the quality of their analytics product, or in making it easier to consume, will likely be well rewarded by paying customers.
Doing IT At 3TIER
Paul was one of the original four 3TIER employees 8 years ago. He tells stories about running their compute complex in a low-rent office with no air conditioning, and running down to Home Depot for some ducting to route the heat outside. It was a big day when they got their first AC unit. Of course, they now had to take turns emptying the condensation :)
When I asked him what his users wanted these days, the answer was simple: "more". More storage, more compute, more bandwidth and so on. That's not surprising, given 3TIER's business. Of course, any time the compute cluster is down (or storage isn't available) that's a big issue -- but that's against a backdrop of the insatiable desire for more, more and even more.
Although 3TIER is in the business of helping their customers forecast better, the same can't really be said for the scientists who supply that expertise -- and their ability to forecast IT requirements. Not surprisingly, big projects that demand big resources materialize seemingly out of nowhere, putting a premium on Paul's ability to react quickly and reposition existing resources.
One thing that jumped out at me during the call is that Paul and his team have an excellent sense of the comparative value of their various data sets. For example, they know how much they paid for the source data sets. They know how many man-hours and compute-hours went into analyzing the data. And they know how much their customers will pay for the forecasting analytics based on that data. There wasn't an overly formal tool or methodology, but they knew it all quite well.
If you think about it, that's quite unusual in the broader IT spectrum -- I mean, how many IT organizations have a well-honed sense of the precise business value associated with each and every data set sloshing around the IT infrastructure? Paul was able to tick off several examples of how this direct knowledge impacted choice of storage media (e.g. spinning disk vs. tape) or data protection approach.
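Paul's team had no formal tool for this, but the reasoning is easy to sketch. The following is purely my own illustration of the idea: the `DataSet` fields, the hourly rates and the threshold are hypothetical numbers, not anything 3TIER shared.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    purchase_cost: float     # what was paid for the source data
    analyst_hours: float     # man-hours spent analyzing it
    compute_hours: float     # cluster hours spent processing it
    expected_revenue: float  # what customers will pay for the resulting analytics


def replacement_cost(ds, analyst_rate=100.0, compute_rate=0.50):
    """Rough cost to recreate the data set; both rates are assumed values."""
    return (ds.purchase_cost
            + ds.analyst_hours * analyst_rate
            + ds.compute_hours * compute_rate)


def protection_tier(ds, threshold=50_000.0):
    """Pick a protection approach from the larger of cost and revenue.

    The threshold is an arbitrary illustrative cutoff.
    """
    value = max(replacement_cost(ds), ds.expected_revenue)
    return "disk plus tape backup" if value >= threshold else "disk only"


# Two hypothetical data sets: a cheap raw feed and an expensive derived product.
raw_feed = DataSet("raw feed", 0.0, 10.0, 1_000.0, 2_000.0)
derived = DataSet("derived model output", 20_000.0, 400.0, 100_000.0, 30_000.0)

for ds in (raw_feed, derived):
    print(ds.name, replacement_cost(ds), protection_tier(ds))
```

The point isn't the specific numbers. Once you can put even a rough figure on what a data set cost to create and what it earns, the media and protection decisions mostly follow.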
Food for thought. If you think about it, 3TIER has two primary assets: really smart scientists, and the data sets that support them.
A Great Gig?
I asked Paul about the satisfaction he got from working at 3TIER, and he was obviously quite pleased with his situation. He got to work with cool technology. He got to work with very bright people. He had the luxury of a very clear mission towards a goal that benefits everyone on the planet. And he got to do it inside a small, collegial organization near Seattle.
It was pretty clear to me that he wasn't going to trade that for anything else anytime soon.
I suppose that's one of the reasons I wanted to talk to Paul when I first met him. He seemed way too happy to be working in IT :)
What's On Paul's Mind?
I always like to end these interviews with an open ended question: what should EMC (and other vendors) be thinking about going forward?
We immediately got into a discussion around storage, and -- more specifically -- bigger and more capacious disks. I offered up my personal opinion that -- based on what we were seeing from drive vendor roadmaps -- the capacity improvements weren't coming as often as they used to. This means, inevitably, more spindles in 3TIER's dramatically growing environment, which moved the discussion to better approaches to managing more spindles in more nodes.
Indeed, this challenge is what got him hooked on Isilon in the first place.
Paul spends a lot of time looking at application-level performance. The Isilon tools (specifically InsightIQ) gave him a good handle on the storage perspective, and he had similar tools that could provide insight from the server and application side. Although he mentioned that a more detailed and integrated view (preferably one that could be understood by the scientists vs. IT guys!) was interesting, it wasn't really a major issue at this point.
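On the performance front, he's got a strong interest in how pNFS (the parallel NFS extension in NFSv4.1) might work in his environment. If you recall, pNFS uses a metadata server to tell clients where the data lives, but the data path itself runs directly between the clients and the storage devices, in parallel, with nothing in the way. Depending on the layout type, that data path can be file-, block- or object-based.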
The discussion turned more serious when we got into data protection and disaster recovery. Right now, Paul backs up his high-value data sets to tape. If their primary site should be unavailable for a long period of time, it wouldn't be the end of the world, but it certainly would degrade their ability to provide high-quality analytics.
Not surprisingly, he eventually wants to create a "second site", but understands that's a significant investment for any company, especially 3TIER. And that's not just storage :)
I also asked him how attractive external cloud-like services might be in his world, especially given the data and compute intensive nature of his business, and the somewhat unpredictable nature of his users' demands. He was quite pragmatic -- he's looking, but hasn't yet found anything better (functionally or economically) than the in-house solution they've built.
As we all work to understand the phenomenon of big data, I find it fascinating to understand the people who are really doing it today: what they're doing, and how they're doing it.
Yes, it's a very different style of IT -- compared to the more familiar traditional enterprise IT model.
To quote William Gibson: "the future is already here -- it's just not evenly distributed".