The Twenty-Sixth ACM SIGKDD Conference on Knowledge Discovery and Data Mining will be held virtually in 2020 due to the COVID-19 pandemic. Join IBM Research AI from August 22nd to August 27th to learn more about our work. We will present several demos, talks, tutorials and papers on topics ranging from healthcare to forecasting, human-centered explainability, optimization, graph representation and automated machine learning.
At the conference, we will showcase some of the work around healthcare resulting from collaborations with Watson Health and Cornell University. One relevant outcome is a novel system that enables the creation and management of predictive AI models throughout all phases of their life cycle, from data ingestion to model productization and deployment in the wild.
Other works focus on interpretable temporal models of patient records that help clinicians understand the progression of a disease and that provide interactive human-in-the-loop explanations. We will also be presenting a series of papers and tutorials on automated machine learning, the importance of data quality for machine learning tasks, and time series forecasting.
For a full list of our papers, demos, tutorials and workshops, see below.
*In addition to our demos featured at the IBM booth, you can try IBM Research Experiments here.
ExBERT: A Visual Tool to Explore BERT: Learn how to uncover insights into what deep Transformer models understand about human language by interactively exploring their learned attentions and contextual embeddings.
GAAMA: Short for Go Ahead Ask Me Anything, GAAMA is a multilingual reading comprehension system for question answering.
AutoAI for Time Series: This demo shows time series forecasting using AutoAI, which automatically selects and optimizes statistical and machine learning pipelines.
Lale: Type-Driven Auto-ML with Scikit-learn: Lale is an open-source library of sklearn-compatible, high-level Python interfaces that simplify and unify automated machine learning in a consistent way.
Command Line AI (CLAI): CLAI is an open-source project from IBM Research that brings the latest in AI and ML technologies to the command line as “skills” and seeks to make the command line user’s daily life more efficient and productive. Check out the 2020 NLC2CMD Competition on automated translation of English to the command line.
Combinatorial Black-Box Optimization with Expert Advice
Map Generation from Large Scale Incomplete and Inaccurate Data Labels Authors: Rui Zhang, Wei Zhang, Conrad Albrecht, Xiaodong Cui, Ulrich Finkler, David Kung, Siyuan Lu https://arxiv.org/pdf/2005.10053.pdf
Molecular Inverse-Design Platform for Material Industries Authors: Seiji Takeda, Toshiyuki Hama, Hsiang-Han Hsu, Victoria Piunova, Dmitry Zubarev, Daniel Sanders, Jed Pitera, Makoto Kogoh, Takumi Hongo, Yenwei Cheng, Wolf Bocanett, Hideaki Nakashika, Akihiro Fujita, Yuta Tsuchiya, Katsuhiko Hino, Kentaro Yano, Shuichi Hirose, Hiroki Toda, Yasumitsu Orii, Daiju Nakano https://arxiv.org/pdf/2004.11521.pdf
Explicit-Blurred Memory Network for Analyzing Patient Electronic Health Records Authors: Prithwish Chakraborty, Fei Wang, Jianying Hu, Daby Sow https://arxiv.org/pdf/1911.06472.pdf
A Canonical Architecture For Predictive Analytics on Longitudinal Patient Records Authors: Parthasarathy Suryanarayanan, Bhavani Iyer, Prithwish Chakraborty, Bibo Hao, Italo Buleje, Piyush Madan, James Codella, Antonio Foncubierta, Divya Pathak, Sarah Miller, Amol Rajmane, Shannon Harrer, Gigi Yuan-Ree https://arxiv.org/pdf/2007.12780.pdf
On Machine Learning-Based Short-Term Adjustment of Epidemiological Projections of COVID-19 in US Authors: Sarah Kefayati, Hu Huang, Prithwish Chakraborty, Fred Roberts, Vishrawas Gopalakrishnan, Raman Srinivasan, Sayali Pethe, Piyush Madan, Ajay Deshpande, Xuan Liu, Jianying Hu and Gretchen Jackson
Cultivating Human Expertise Through AI-Assisted Data Science Authors: Josh Andres, Christine Wolf, Michael Muller, Justin Weisz, Narendra Nath Joshi, Aabhas Sharma, Krissy Brimijoin, Michael Desmond, Zahra Ashktorab, Qian Pan, Evelyn Duesterwald and Casey Dugan
The Next Decade of Data Science Authors: Justin Weisz and Michael Muller
Human-in-the-Loop Automated Data Science Outperformed Human Data Scientists in Model Building Authors: Dakuo Wang, Josh Andres, Justin Weisz, Erick Oduor, Udayan Khurana, Horst Samulowitz, Arunima Chaudhary, Abel Valente, Dustin Torres and Casey Dugan
Hybrid Edge-Cloud based Ensemble Learning for Forecasting Occupancy of Open-plan Offices Authors: Fatemeh Jalali, Subhrajit Roy, Ramachandra Rao Kolluri, Maneesha Perera, Mahsa Salehi, John D. Vasquez, Julian de Hoog
Explainable AI based interventions for pre-season decision making in fashion retail Authors: Surya Shravan Kumar Sajja, Nupur Aggarwal, Sumanta Mukherjee, Kushagra Manglik, Satyam Dwivedi, Vikas Raykar