Last week we hosted a webinar with 451 Research and IBM about Big Data and the Rise of Spark. Here is a summary of some of what Donnie Berkholz, Research Director, Development, DevOps & IT Ops, had to say on the topic.
How are we solving data science?
Research from KDnuggets in May 2015 asked respondents which analytics, big data, data mining and data science software they had used in the past 12 months for a real project. The trend over the last three years shows that code-based software is on the rise, suggesting a shift from data analysis toward data science. Further down the list, we see an increase in Hadoop as well. Then, in just two years, we see Spark come out of nowhere to reach 11.3% of respondents over the last year. Some of the most popular software out there was used by fewer than half of respondents just two years ago. Spark has taken a third of that market in just two years.
So how are we solving problems in data science?
If we restrict the list to just big data tools, you can see how big Spark has become in two years. Only very recently has Spark reached a key level of usability.
Developers Love Spark
An analysis of Stack Overflow by RedMonk shows very clearly that a number of technologies have grown, but that Spark has come out of nowhere to completely explode over the last 18 months. It shows that you need to be aware of Spark and know how to use it. Out of all the big data technologies, Spark has emerged from the pack to get developers excited, and technology that excites developers gets implemented.
It’s early days for dollars
Commercial interest in Spark was pretty far down the list, and in terms of commercial offers it is still early days. But that is changing. 451 Research predicts that the Hadoop market, which Spark is a part of, will grow from millions to billions over the next four years. This means there will be a lot more tools to select from.
What challenges are hindering big data success?
A survey from Talend showed that the main challenges around big data are people and resources: in-house expertise and the allocation of sufficient budget, time and resources. There is a huge skills gap in the market because there are not enough people with big data expertise, and those who have it are difficult to find and hire. We therefore need technologies that are easy to implement and easy to train people on. Spark gives us that technology.
Donnie also outlined the problems we have had to date in dealing with big data; for more detail on each, listen to the webinar.
How can we get from here to there? Data engineering needs DevOps
DevOps, in short, takes agile software development and pushes it through to production. It is agile, truly tip to tail. The most important part is that DevOps benefits the business and the customer: you can deliver business value faster.
For more from Donnie Berkholz on DevOps, its culture, automation and removing silos, listen to the webinar now. You’ll also be able to listen to the discussion between IBM, 451 Research and Flexiant.