IBM announced in June that it has embarked on a quest to create a million new data scientists. It will be adding about 230 of them during its Datapalooza educational event this week in San Francisco, where prospective data scientists are building their first analytics apps.
Next year, it will take its show on the road to a dozen cities around the world, including Berlin, Prague, and Tokyo.
The prospects who signed up for the three-day Datapalooza convened Nov. 11 at Galvanize, the high-tech collaboration space in the South of Market neighborhood, to attend instructional sessions, listen to data startup entrepreneurs, and use workspaces with access to IBM’s newly launched Data Science Workbench and Bluemix cloud services. Bluemix gives them access to Spark, Hadoop, IBM Analytics, and IBM Streams.
Rob Thomas, vice president of product development, IBM Analytics, said the San Francisco event is a test drive for IBM’s 2016 Datapalooza events. “We’re trying to see what works and what doesn’t before going out on the road.”
Thomas said Datapalooza attendees were building out DNA analysis systems, public sentiment analysis systems, and other big data apps.
Note that this article was submitted to and accepted by KDnuggets, the most popular blog site about machine learning and knowledge discovery.
I have been using Lean Six Sigma (LSS) to improve business processes for more than 10 years and am very satisfied with its benefits. Recently, I’ve been working with a consulting firm and a software vendor to implement a machine learning (ML) model to predict the remaining useful life (RUL) of service parts. What frustrates me most is the low accuracy of the resulting model. As shown below, if the deviation is measured as the absolute difference between the actual part life and the predicted one, the model has average deviations of 127, 60, and 36 days for the three selected parts. I could not understand why the deviations are so large with machine learning.
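To make the deviation measure concrete, here is a minimal sketch of how an average absolute deviation between actual and predicted part life could be computed. The function name and the sample lifetimes are hypothetical, chosen only for illustration; they are not the project's data or the vendor's code.

```python
def mean_absolute_deviation(actual_days, predicted_days):
    """Average of |actual - predicted| over all parts, in days."""
    if len(actual_days) != len(predicted_days):
        raise ValueError("actual and predicted must have the same length")
    if not actual_days:
        raise ValueError("at least one observation is required")
    total = sum(abs(a - p) for a, p in zip(actual_days, predicted_days))
    return total / len(actual_days)


# Hypothetical part lifetimes in days (illustrative values only).
actual = [400, 250, 310, 520]
predicted = [380, 300, 290, 500]

mad = mean_absolute_deviation(actual, predicted)
print(f"Average deviation: {mad} days")
```

Because the measure takes absolute values, over-predictions and under-predictions do not cancel each other out, which is why it can stay large even when the model is right on average.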
After working with the consultants and data scientists, it appears that they can reduce the deviation by only 10%. This puzzles me a lot. I thought machine learning was a great new tool to make forecasting simple and quick, and I did not expect it to have such large deviations. To me, such deviation, even after the 10% improvement, still renders the forecast useless to the business owners.
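As a rough check on what a 10% improvement would mean, the sketch below applies it to the three average deviations quoted earlier. It assumes the 10% is a relative reduction of each deviation, which is my reading rather than something the consultants stated, and the part labels are placeholders.

```python
# Average deviations quoted in the text, in days; part labels are placeholders.
reported_deviation_days = {"part 1": 127, "part 2": 60, "part 3": 36}

# Assumption: "10% improvement" means each deviation shrinks by 10%.
relative_improvement = 0.10

improved_deviation_days = {
    part: round(days * (1 - relative_improvement), 1)
    for part, days in reported_deviation_days.items()
}

for part, days in improved_deviation_days.items():
    print(f"{part}: {days} days")
```

Under that assumption the deviations fall to roughly 114, 54, and 32 days.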