The Best of Data Science and Machine Learning – The Hackathon Winners

11 February 2022

Find out about the hackathon that brought together contestants who learned new technologies around data science and machine learning, and published their projects on the Cloud Pak for Data Gallery.

The IBM Academy of Technology is made up of enthusiastic IBMers who get together to work on activities of their own choice that they think will be useful to themselves and to IBM. The Best of Data Science & Machine Learning (ML) Projects Hackathon, led by Thomas Schäck, Distinguished Engineer for Watson Studio, is an example. In 2021, the hackathon brought together contestants who learned new technologies and put them into practice, extended their network, and located colleagues with suitable expertise – and they had fun too.

Two maps showing results of flood risks from the data science and machine learning hackathon project.

Of the competing projects, two were published on the Cloud Pak for Data Gallery:

1. Flood risk:
Floods are the most frequent type of natural disaster and can cause widespread devastation, resulting in loss of life and damage to personal property and critical public health infrastructure. Flooding occurs in every U.S. state and territory, and is a threat anywhere in the world that receives rain. According to NOAA, floods in the U.S. kill more people each year than tornadoes, hurricanes, or lightning. Understanding flood risk is important so that people don’t ignore the warnings sent out by agencies like the National Weather Service (NWS). Warnings also need to be more specific, pinpointing certain areas and exposed locations. This project extends the FAIR model to flood analysis. FAIR, short for “Factor Analysis of Information Risk”, is the only international standard quantitative model for information security and operational risk. As described on Wikipedia, FAIR underlines that risk is an uncertain event, and that one should focus not on what is possible but on how probable a given event is. This probabilistic approach is applied to every factor that is analyzed. The risk is the probability of a loss tied to an asset. In FAIR, risk is defined as the “probable frequency and probable magnitude of future loss”. FAIR further decomposes risk into the factors that make up probable frequency and probable loss. These factors include Threat Event Frequency, Vulnerability, Threat Capability, Primary Loss Magnitude, and Secondary Risk. The project calculates the Loss Event Frequency based on how vulnerable and susceptible the flood location is. Finally, the Final Threat Level is calculated from the severity of the flood alert and the Loss Event Frequency.
2. Site search:
The site search recommender improves search relevancy by using user behavior data from ibm.com search, de-identified for public consumption. It is built with the open-source deep learning libraries TensorFlow and Keras, and implements a collaborative filtering algorithm to make meaningful recommendations to users based on their search terms and historical search behavior. The project lets data scientists improve the relevancy of corporate site search results, serves as boilerplate that provides out-of-the-box support for the search use case, and uses data and AI to solve real-life search and discovery challenges.
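To make the Flood risk calculation more concrete, here is a minimal Python sketch of the two steps the description names: deriving a Loss Event Frequency from how vulnerable a location is, then combining it with the alert severity into a Final Threat Level. It is not the project's code. The multiplication of Threat Event Frequency by Vulnerability follows the usual FAIR relationship, while the `FloodLocation` type, the severity weights, and the thresholds are assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights for alert severities; the project's real scale
# is not given in this post.
SEVERITY_WEIGHT = {"minor": 1, "moderate": 2, "severe": 3, "extreme": 4}


@dataclass
class FloodLocation:
    name: str
    threat_event_frequency: float  # assumed input: expected flood events per year
    vulnerability: float           # assumed input: probability in [0, 1] that an event causes loss


def loss_event_frequency(location: FloodLocation) -> float:
    """FAIR-style Loss Event Frequency: threat event frequency times vulnerability."""
    if not 0.0 <= location.vulnerability <= 1.0:
        raise ValueError("vulnerability must be a probability between 0 and 1")
    return location.threat_event_frequency * location.vulnerability


def final_threat_level(location: FloodLocation, alert_severity: str) -> str:
    """Combine alert severity with Loss Event Frequency into a coarse level.

    The thresholds below are illustrative assumptions, not the project's values.
    """
    score = loss_event_frequency(location) * SEVERITY_WEIGHT[alert_severity]
    if score >= 2.0:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"


if __name__ == "__main__":
    riverside = FloodLocation("riverside", threat_event_frequency=2.0, vulnerability=0.5)
    print(final_threat_level(riverside, "severe"))
```

Keeping the frequency step separate from the threat-level step mirrors the FAIR idea of decomposing risk into factors that can each be estimated and revised on their own.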
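The collaborative filtering idea behind the Site search project can also be illustrated in a few lines. The project itself uses TensorFlow and Keras; the sketch below deliberately swaps in a much simpler technique, item-to-item cosine similarity in plain Python, just to show how recommendations can come from the behavior of other users. The click log, page names, and function names are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical de-identified click log: (user_id, page) pairs.
CLICKS = [
    ("u1", "cloud-pricing"), ("u1", "watson-studio"),
    ("u2", "cloud-pricing"), ("u2", "watson-studio"), ("u2", "db2-docs"),
    ("u3", "watson-studio"), ("u3", "db2-docs"),
    ("u4", "cloud-pricing"),
]


def build_item_users(clicks):
    """Map each page to the set of users who clicked it."""
    item_users = defaultdict(set)
    for user, item in clicks:
        item_users[item].add(user)
    return dict(item_users)


def cosine(users_a, users_b):
    """Cosine similarity between two pages, treating each as a set of users."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))


def recommend(user, clicks, top_n=3):
    """Rank unseen pages by their summed similarity to pages the user clicked."""
    item_users = build_item_users(clicks)
    seen = {item for u, item in clicks if u == user}
    scores = {}
    for candidate, users in item_users.items():
        if candidate in seen:
            continue
        score = sum(cosine(users, item_users[s]) for s in seen)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


if __name__ == "__main__":
    print(recommend("u4", CLICKS))
```

A neural model such as the one the project describes would typically learn user and item embeddings instead of counting overlaps, but the input (who interacted with what) and the output (a ranked list of unseen items) have the same shape.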

You can try the projects Flood risk and Site search yourself. Note:

If you already signed up for Cloud Pak for Data as a Service, you can just sign in and create a project from the project samples.
Otherwise, you can sign up for Cloud Pak for Data as a Service here to get started.

With many thanks to the data science community in the IBM Academy of Technology for their energy, dedication, and determination and to the two teams who created these projects.

Authors:

Thomas Schaeck, schaeck@de.ibm.com

Susan Malaika, malaika@us.ibm.com

Footnotes

The content in this blog post is the opinion of the author. For more on the IBM Academy of Technology, see these posts:

Four steps to better business forecasting with analytics

Use the power of analytics and business intelligence to plan, forecast and shape future outcomes that best benefit your company and customers.
