There is no question big data and cloud computing are two of the hottest IT topics today. Demand for skilled individuals in these areas and the salaries offered are growing quickly. Fortunately, both areas are somewhat related, so you can start your education in big data, and at the same time experience and learn cloud computing concepts along the way. Though you can spend some time researching these topics on the internet, there is a better and easier way: Explore the free courses in Big Data University.
BigDataUniversity.com is an online educational website offering free courses about big data and databases. The site is run by the community, which includes many IBMers who volunteer their time to develop courses and enhance the site. Its motto is "Learn @yourpace, @yourplace from the industry's best." What is appealing about Big Data University is that most of the courses include hands-on labs that you can perform on the cloud. For example, one of the courses in Big Data University is sponsored by Amazon Web Services, which is providing a 25-dollar credit to learn big data on its cloud. Each course in Big Data University has a short test you can take, and if you pass it, you can print yourself a certificate of completion.
This article lists the courses currently available in Big Data University and the ones that are soon to be published. Though none of the courses have prerequisites, there is a suggested order in which to take them.
Big Data University courses are classified into three categories:
- Big data-related topics
- Database (DB2) related topics
- Miscellaneous topics
Figure 1 shows the courses in the big data category and the order in which we suggest you take them (top to bottom, then left to right), depending on your current knowledge of big data concepts.
Figure 1. Big Data University courses - Big data category
The "Big data analytics demos" course at the top of the figure provides an overview of what big data is, why it's important, and its characteristics. It also introduces you to the concepts of data-at-rest analytics (think of an ocean as an analogy: huge amounts of data, but not really flowing) and data-in-motion analytics (think of a river or a stream as an analogy: data flows constantly and has to be analyzed in real time).
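The difference between the two styles can be illustrated with a small sketch. The Python below is only an illustration of the idea and is not taken from any course; the function names and sample messages are hypothetical. The first function analyzes a complete data set in one pass (data at rest), while the second updates its result as each record arrives (data in motion).

```python
from collections import Counter
from typing import Iterable, Iterator, List


def count_at_rest(records: List[str]) -> Counter:
    """Data at rest: the full data set already exists, so analyze it in one pass."""
    return Counter(word for line in records for word in line.lower().split())


def count_in_motion(stream: Iterable[str]) -> Iterator[Counter]:
    """Data in motion: update the result as each record arrives and emit a snapshot."""
    running = Counter()
    for line in stream:
        running.update(line.lower().split())
        yield Counter(running)  # copy, so earlier snapshots are not modified later


# Hypothetical sample data standing in for stored records or an incoming feed.
messages = ["big data is big", "data flows"]

batch_result = count_at_rest(messages)
snapshots = list(count_in_motion(iter(messages)))
```

After the last record has arrived, the final streaming snapshot matches the batch result; the difference is that the streaming version had a usable answer after every record.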
The courses on the left of Figure 1 ("Hadoop fundamentals I," "Hadoop and the Amazon Cloud," "Hadoop and the IBM SmartCloud Enterprise," and, in beta, "Hadoop Fundamentals II") are mainly for data-at-rest analytics. They teach you how to work with Hadoop, an open source Java framework that helps you process large amounts of data quickly. Note that these courses have labs that you can run either on the Amazon Cloud or the IBM SmartCloud Enterprise. We suggest you take the courses in this section in the order listed, from top to bottom.
In the center of Figure 1, three courses are listed:
- "Spreadsheet-like analytics" (in beta) allows non-technical users to take advantage of big data technologies without having to learn how to write a program to run Hadoop, JAQL, and so on. It uses BigSheets, a plug-in that can be run on top of Hadoop, and is designed for the business user who is familiar with spreadsheet tools such as MS Excel.
- "Text Analytics Essentials I" teaches you the basics of how to perform analytics on unstructured data, such as the content of an email, or any other document. It uses Annotation Query Language (AQL) to specify how to filter the information. A text analytics Eclipse plug-in can be used to develop the AQL which can later be deployed on top of Hadoop to crunch big data.
- "Query Languages for Hadoop" (in beta) teaches you how to work with query and scripting languages such as Hive, Pig, and JAQL. These languages simplify the development of map-reduce programs in Hadoop for developers with no Java expertise.
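To give a sense of what these languages save you from writing by hand, here is a minimal sketch of the map-reduce pattern as a word count in plain Python. It is not Hadoop's Java API and is not taken from the courses; the function names and sample input are hypothetical, and everything runs in a single process rather than on a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key before reducing."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the map, shuffle, and reduce steps in sequence."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample input.
counts = word_count(["hadoop stores data", "hive queries data"])
```

A higher-level language such as Hive lets you express the same kind of aggregation declaratively, for example with a GROUP BY clause, instead of writing the map and reduce steps yourself.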
On the right of the figure is the list of soon-to-be-published courses for data-in-motion analytics ("Stream computing I" and "Stream computing II," both in beta). They will discuss, for example, how to analyze tweets or Facebook comments as the data flows in real time. They will also discuss how to perform log analysis, complex event processing, and more.
Figure 2 shows the courses in the database (DB2) category and the order in which we suggest you take them (top to bottom, then left to right), depending on your current knowledge of database concepts.
Figure 2. Big Data University courses - Database (DB2) category
The "SQL fundamentals I" course at the top of Figure 2 is an introductory course that not only teaches you SQL, but also basic concepts about relational database management systems, and other systems. Take this course and read the book Database Fundamentals for the best learning experience.
The courses on the left of Figure 2 provide you with a solid foundation of core DB2 concepts. Take the "DB2 essential training I" and "DB2 essential training II" courses and read the book Getting started with DB2 Express-C for optimum results.
The soon-to-be-published "What's new in DB2 10" course explains the new features available with the latest release of DB2 for Linux, UNIX, and Windows. It will include videos with demonstrations of features such as time travel query, multi-temperature storage, Oracle compatibility, and more.
In the center of Figure 2 there is one course listed: "Data Studio Essential Training I." At the time of writing, this course was being updated to the latest version of Data Studio; however, you can review the videos in the course to get familiar with Data Studio, even though they were created for a previous version of the product.
Finally, on the right side of Figure 2, the course "DB2 academic training - 302A exam preparation" is listed. This course prepares you for IBM Exam 302A, developed for the academic community. It includes 13 lessons and a sample test that will give you a good indication of how you would do in the real exam.
Figure 3 shows the list of courses in the Miscellaneous category.
Figure 3. Big Data University courses - Miscellaneous category
The "Creating a course in Big Data University" course provides all the instructions required for anyone interested in developing a course to publish in Big Data University. We encourage you to review this course and find out how easy it is to create your own course. Though all the courses in Big Data University to date are free of charge, if you would like to develop a course that requires a fee, Big Data University has the capability to support this.
Finally, the "Open source development" course (in beta) includes a list of open source tasks that need to be implemented to support Big Data University features. Members from the community willing to help develop these features using PHP are free to contact us, and we can grant access to this course to review projects or tasks that need to be completed.
This article described the courses available at Big Data University that you can take to enhance your skills in big data and database technologies. The figures presented in the article provide a suggested order in which to take them. All the courses in Big Data University are currently free, have hands-on lab exercises, and allow you to print your certificate of completion after passing a test.
Big Data University is a community site sponsored by IBM. We invite community members to develop new courses, and the "Creating a course in Big Data University" course has all the instructions to get started.
- Learn more at the Big Data University home page.
- Review the article "Get started with Hadoop-based data analytics in IBM SmartCloud Enterprise" (developerWorks, Oct 2011) to learn how to set up a Hadoop cluster in the IBM SmartCloud Enterprise.
- Follow Big Data University on
- Learn more about IBM Cloud offerings at ibm.com/cloud.
Get products and technologies
- Download a free trial version of DB2 for Linux, UNIX, and Windows.
- Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edition and provides a solid base to build and deploy applications.
- Download IBM Data Studio, available at no charge. Data Studio provides an integrated, modular environment for productive database administration and also includes collaborative database development tools for DB2, Informix, Oracle, and Sybase.
- Like us on Facebook at facebook.com/bigdatauniversity.
- For current cloud computing information, visit the blog Thoughts on Cloud.
Raul F. Chong is a senior program manager working in the Information Management Cloud Computing Center of Competence. He has worked at IBM for 14 years as a database consultant, support specialist, information developer, and technical evangelist. His main areas of expertise are cloud computing, big data, and databases.