We often forget how new Spark is. While it was invented much earlier, Apache Spark only became a top-level Apache project in February 2014 (generally a sign that it’s ready for anyone to use), just 18 months ago. I might have a toothbrush that is older than Apache Spark!
Since then, Spark has generated tremendous interest because the new data processing platform scales so well, delivers high performance (up to 100 times faster than alternatives), and is more flexible than its competitors, both open source and commercial. (If you’re interested, see the trends on both Google searches and Indeed job postings.)
Spark gives the Data Scientist, Business Analyst, and Developer a new platform to manage data and build services, because it can compute in real time via in-memory processing. The project is extremely active, with ongoing development and serious investment from IBM and key players in Silicon Valley.
Tips for getting started with Apache Spark
Given Spark’s great potential to revolutionize advanced analytics for big data and modern applications, the IBM Analytics for Apache Spark team is frequently asked for tips on great resources for getting up to speed on Spark.
Below is our team’s list of recommended resources that we share with you in anticipation of the IBM Analytics for Apache Spark open beta:
You have no idea what Spark is and want to at least be informed
A few days back, I started the Spark Fundamentals I course on bigdatauniversity.com. I downloaded the 5+ GB QSE image but was surprised to find that the Spark service is missing (when I started all services). On digging deeper (when spark-shell failed to start), I found that the Spark binaries are not present in the required folder [the soft link spark-client -> /usr/iop/188.8.131.52/spark is there, but the actual binaries are missing].
I had to waste a lot of time troubleshooting. Since I could not fix the problem with the Spark image (out of desperation, I even tried to install and build Spark myself), I am now trying to see whether the alternate Docker images work.
I posted in the help section on bigdatauniversity.com but no one replied. I am surprised there are no forums on bigdatauniversity.com; I searched a lot but could not find any related link. Can you help me out, please? I need to get up to speed with Spark quickly, for both professional and academic reasons.
Fortunately, I now find that I can get spark-shell up and running with the Docker image, but I would love to get the same on the QSE (BigInsights) image. Apache Ambari simply does not show the Spark service as up, even though I do “start all” from the console or run “restartAll.sh” from the terminal. Looking for your input and help!
I am surprised by the author’s unresponsiveness to the problem I faced in the bigdatauniversity Spark course. It is the author who suggested the bigdatauniversity course, and when we faced a problem and mentioned it, he was silent. This shows a great lack of responsibility. If you do not know the answer, at least admit it; do not be silent.
Fortunately, I found the answer to my question in the forum. The problem was that I could not locate the forum link.