NetApp CIO Cynthia Stoddard has invested in Big Data to help the storage manufacturer make sense of the rapidly increasing volume of information that is generated by performance monitoring software running on top of customer systems.
Stoddard said the move to a Hadoop-based system has yielded several advantages, including cost savings and efficiency. Most critically, the shift has helped NetApp develop new insights into what customers want and how they use the company’s products.
Stoddard, who became CIO in March after 18 months as a vice president of IT management at NetApp, said that data generated by AutoSupport software, which monitors the storage devices that customers purchase from NetApp and detects errors that occur in them, doubles in volume every 16 months. She determined that the company’s Oracle database software wasn’t processing the AutoSupport information effectively.
“I sit on thousands of customers’ data and what I do with that data is essential to the company,” Stoddard told CIO Journal. “I need to react and help customers do more with their systems.”
She decided to move to a new database system. Stoddard and her team evaluated several systems before choosing a solution from Cloudera, which sells software based on Hadoop, the open source framework for distributed storage and processing. Hadoop speeds up data processing by splitting data into chunks, replicating those chunks across a cluster of computers and working on them in parallel. Google, Samsung and Morgan Stanley all use Hadoop for Big Data workloads, which involve processing large volumes of unstructured data such as e-mail and social media content.
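The idea of replicating data chunks across computers can be illustrated with a short Python sketch. It is a simplified illustration, not Hadoop's actual placement logic (which also weighs factors such as rack location). The function name `place_blocks`, the node names and the round-robin rule are all hypothetical. The default of three copies matches the standard default replication factor in Hadoop's file system.

```python
def place_blocks(data: bytes, block_size: int, nodes: list[str],
                 replicas: int = 3) -> dict[int, list[str]]:
    """Split data into fixed-size blocks and assign each block to
    `replicas` distinct nodes (simplified round-robin placement)."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    # Ceiling division: a partial final block still counts as a block.
    n_blocks = (len(data) + block_size - 1) // block_size
    placement = {}
    for i in range(n_blocks):
        # Block i starts at node i and wraps around the node list,
        # so consecutive replicas always land on different machines.
        placement[i] = [nodes[(i + r) % len(nodes)] for r in range(replicas)]
    return placement


# Hypothetical four-node cluster holding 1,000 bytes in 256-byte blocks.
layout = place_blocks(b"x" * 1000, block_size=256,
                      nodes=["node-a", "node-b", "node-c", "node-d"])
for block_id, holders in layout.items():
    print(block_id, holders)
```

Because every block lives on several machines, a query can read different blocks from different computers at the same time, and the loss of one machine does not lose data.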
The new database compares customer product configurations against the 24 billion records that AutoSupport has collected. When the database finds an old incident record that matches the symptoms of a current customer problem, it sends an alert to NetApp workers, who can address the immediate issue and study the pattern of breakdown. “In order to do the type of analytics that we need to do, we’re now able to pull all of that information in and do it justice,” said Stoddard. The new system also helps NetApp reduce support calls by anticipating potentially faulty products and providing fixes before they break down.
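NetApp has not published how this matching works. As a rough sketch of the general idea, the Python below compares a current set of symptoms against past incident records for the same product model and returns the ones that overlap. The record fields, the `find_matches` function, the overlap threshold and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    incident_id: str
    model: str
    symptoms: frozenset


def find_matches(history, model, symptoms, min_overlap=2):
    """Return past incidents on the same model that share at least
    `min_overlap` symptoms with the current problem, best match first."""
    current = frozenset(symptoms)
    matches = [
        inc for inc in history
        if inc.model == model and len(inc.symptoms & current) >= min_overlap
    ]
    # Rank by the number of shared symptoms, largest overlap first.
    return sorted(matches, key=lambda inc: len(inc.symptoms & current),
                  reverse=True)


# Hypothetical incident history.
history = [
    Incident("INC-1", "model-x",
             frozenset({"fan_warning", "disk_latency", "high_temp"})),
    Incident("INC-2", "model-x", frozenset({"disk_latency"})),
    Incident("INC-3", "model-y", frozenset({"fan_warning", "high_temp"})),
]

alerts = find_matches(history, "model-x", {"fan_warning", "high_temp"})
for inc in alerts:
    print("Alert: current problem resembles", inc.incident_id)
```

A production system would run this kind of comparison across billions of records in parallel rather than in a single loop, which is where a distributed platform such as Hadoop comes in.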
The process of loading data and running queries in the database once took four weeks to complete. The team can now produce query results in roughly 11 hours. Stoddard says that NetApp’s data-crunching costs have been cut roughly in half, but that it’s too soon to measure the impact of the new database on company finances. NetApp’s revenues totaled $6.23 billion in fiscal year 2012, up 22% from $5.12 billion in fiscal year 2011.
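The figures above can be checked with a few lines of Python. The speedup calculation assumes the four weeks were continuous wall-clock time, which the article does not state, so the result is a rough upper bound rather than a measured number.

```python
HOURS_PER_WEEK = 7 * 24

# Assumption: "four weeks" means four weeks of round-the-clock processing.
old_hours = 4 * HOURS_PER_WEEK
new_hours = 11
speedup = old_hours / new_hours

# Year-over-year revenue growth, in billions of dollars.
revenue_growth = (6.23 - 5.12) / 5.12

print(f"Query turnaround: {old_hours} hours -> {new_hours} hours "
      f"(about {speedup:.0f}x faster)")
print(f"Revenue growth: {revenue_growth:.1%}")
```

Under that assumption the turnaround improved by a factor of about 61, and the revenue growth works out to 21.7%, which rounds to the 22% cited.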
Stoddard says the new database’s biggest payoff is that it provides fresh insight into the products and features that NetApp customers favor, and will help the company improve its offerings. Indeed, NetApp CEO Tom Georgens wants Stoddard to use information collected about popular products and features to help the sales and marketing teams identify leads and target new customers. Data collected from Hadoop-based software could eventually augment the company’s product development, which would help it compete with other storage vendors. Stoddard declined to outline specific plans for such tasks.
To be sure, NetApp isn’t the only storage vendor tapping Big Data. A group within EMC uses rich data sets to help the company analyze customer experiences, ultimately to improve the company’s products and services. EMC is the leader in the $30 billion-a-year market for disk storage; IDC said June 8 that EMC commanded a 29% revenue share in the first quarter, followed by NetApp with a 14.1% share.
NetApp’s Hadoop installation went without a hitch, Stoddard said. But if there is one area where NetApp is challenged, it’s in finding enough employees with the skills to install Hadoop systems. She said it’s hard to keep her team of around 10 Big Data workers intact. “Once employees get a taste of these [Hadoop] skills, they become extremely marketable,” Stoddard said.