IBM targets data scientists with a new development platform based on Apache Spark

IBM's Data Science Experience is the first native Spark platform for data scientists and developers, the company says.

Credit: IBM

The goal is to make analytics easier by putting tools within easy reach

Making sense of data can involve a wide variety of tools, and IBM is hoping to make data scientists' lives easier by putting them all in one place.

The company on Tuesday released what it calls Data Science Experience, a new development environment in the cloud for real-time, high-performance analytics.

Based on the data-processing framework Apache Spark, Data Science Experience is designed to speed up and simplify the process of embedding data and machine learning into cloud applications. The new offering includes tools such as RStudio and Jupyter Notebooks.

Developers can tap Python, R and Scala; they can also view sample notebooks and watch tutorials while they code. Additional tools focus on data preparation and cleaning, visualization, prescriptive analytics, data connections, and scheduling jobs. Users can collaborate with others and share their code.

Data Science Experience is now available on the IBM Cloud Bluemix platform.

"Computer science went mainstream with the introduction of the PC," said Bob Picciano, senior vice president of IBM Analytics. "With data science, the major roadblock is having access to large data sets and having the ability to work with so much data."

IBM has invested US$300 million in Apache Spark, including contributions to SparkR, Spark SQL, and Apache SparkML.

The Data Science Experience combines the best of three worlds, said Mike Gualtieri, a principal analyst with Forrester.

First, "it is cloud-based, so it will be easily accessible to all comers," including seasoned data scientists, citizen data scientists, and application developers, Gualtieri said.

Second, the platform offers multiple open-source tools, including the Jupyter data-science notebook, he added.

Finally, "the power of Apache Spark is behind these tools," Gualtieri said, allowing users to analyze data with machine-learning tools at in-memory speeds in the cloud.

Companies are increasingly recognizing the potential of artificial intelligence in business software.

"Adding intelligence to applications, whether you call it AI, machine learning, or cognitive computing, is now top of mind for enterprises," Gualtieri said.
