Apache Spark 1.2.0 released

This release brings operational and performance improvements in Spark core, including a new network transport subsystem designed for very large shuffles. Spark SQL introduces an API for external data sources along with Hive 13 support, dynamic partitioning, and the fixed-precision decimal type. MLlib adds a new pipeline-oriented package (spark.ml) for composing multiple algorithms. Spark Streaming adds a Python API and a write-ahead log for fault tolerance. Finally, GraphX has graduated from alpha and introduces a stable API.

Why Big Data and AI Need Each Other — and You Need Them Both – Forbes

Go ahead: give yourself a pat on the back. You’ve been doing a great job with big data. You’re collecting and analyzing customer information, gleaning insights into what customers want and need, and acting on those insights. For the first time ever, you’re able to position products to respond to customers’ greatest needs — and you know it’s working, because you’re collecting data that proves it. You’re way ahead of most of your peers in deriving real value from big data. But you’re not done yet. If you want to stay competitive as data growth continues to skyrocket, you’re going to have to do much more to get the maximum value from the customer data you’re collecting. And to do it, you’re going to need artificial intelligence / machine learning.