In the US, we’re deep into Presidential Election season. On each side, we have candidates who are strong on some issues and weak on others, and all of them have a position and perspective to share. While the policy wonks pay attention year-round and are deep into the nuance of every issue, it’s hard for the average voter to keep up. Even worse, when there are important topics that voters want to learn more about, the candidates often don’t provide context or explain the basics. The good news is that it doesn’t have to be that way.
Understanding the Debates
The first step to understanding any policy issue is knowing the events that are driving the attention and interest around it. While you could Google things as you hear them, it would be more effective to use some of the technology available, specifically the Clarify.io and AlchemyData News APIs.
The AlchemyData News API lets you retrieve news and blog content based on keywords and concepts, further curated with Natural Language Processing. It’s not just a simple search: the results are annotated and enriched with sentiment analysis, named entity recognition, and even relationships. In short, it understands that when a candidate mentions Vladimir Putin and Georgia, they probably don’t mean Atlanta.
The Clarify API is quite a bit different. It is an API for analyzing and understanding audio and video content. It produces not just a transcript but a timestamped wordlist that is then used to extract keywords and topics, and it can even allow you to search the audio and video.
By combining these APIs, we can build something that both identifies important segments of the debates and allows us to put the information in context quickly and easily.
So let’s turn the debates into something we can work with.
Note: If you try this experiment yourself, you don’t have to download the files. Simply use the curl commands below with your API key and save some time.
For each of these requests, we’ll receive a bundle_id representing each file. This bundle_id will give us access to keywords, topics, and any other information we want. Keep it on hand for the next steps.
Note: Processing takes one minute for every minute of video, so we’ll have to wait before continuing to the next step.
Now that processing is complete, we can use the bundle_id to understand the debates.
Requesting the keywords insight on each debate, we can use curl to get the results:
And then combine those results into a simple Venn Diagram as shown:
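The grouping behind a diagram like this comes down to set operations. Here is a minimal Python sketch, assuming two debates (the article does not say how many were compared). The function name `venn_regions` is hypothetical, and the split of keywords between the two sample lists is invented purely for illustration.

```python
def venn_regions(first, second):
    """Split two keyword lists into the three regions of a two-set Venn diagram."""
    # Normalise case and whitespace so "Syria" and "syria" count as one keyword.
    a = {k.strip().lower() for k in first}
    b = {k.strip().lower() for k in second}
    return {
        "only_first": sorted(a - b),
        "shared": sorted(a & b),
        "only_second": sorted(b - a),
    }


# Placeholder keyword lists: which debate each word came from is made up.
debate_one = ["Syria", "tuition", "Putin", "Libya"]
debate_two = ["syria", "Putin", "tax-payers"]

regions = venn_regions(debate_one, debate_two)
for name, words in regions.items():
    print(f"{name}: {', '.join(words)}")
```

The "shared" region is the overlap in the middle of the diagram; the other two hold the keywords unique to each debate.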
Three categories of keywords stick out, along with a set of leftover terms:
- First, we see many of the candidates referenced or referred to – O’Malley, Sanders, Trump, Marco Rubio, Hillary, Fiorina;
- Next, we see a number of locations – Bernadino (misspelled), Bernadine (again, misspelled), Syria, [New] Hampshire, and Libya;
- Then, a number of world leaders – Moammar Qadaffi (also spelled as Khadafi), Barack Obama, and Putin;
- Finally, we have some common words that don’t fit into any of these categories: re-invigorate, re-build, tax-payers, tuition, and others. While we could research these terms, getting useful results will be more likely if we combine them with the candidates’ names. That will give us their position on these topics.
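Pairing the leftover terms with candidate names is easy to automate. The following Python sketch shows one way to do it; `build_queries` is a hypothetical helper, and the two short lists are only a sample drawn from the keywords above.

```python
from itertools import product


def build_queries(candidates, terms):
    """Pair every candidate with every term to form 'candidate term' search strings."""
    return [f"{candidate} {term}" for candidate, term in product(candidates, terms)]


# A small sample taken from the keywords listed above.
candidates = ["Sanders", "Trump"]
terms = ["tuition", "tax-payers"]

queries = build_queries(candidates, terms)
for query in queries:
    print(query)
```

Each resulting string can then be used as a search phrase in place of the bare term.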
Without knowing anything else about the debates, we’re already building a “cheat sheet” on the major topics, people, and locations involved. Now let’s take those terms and names and use the AlchemyData News API to add color and understanding.
After downloading an API key, we can then use the AlchemyData News API to search for relevant news and information. There are 400+ different facets available to search the news. Fortunately, the AlchemyData News API provides the ability to simplify the query and limit the keyword location to the titles:
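As a rough illustration, a title-limited query could be assembled like this in Python. Everything specific here is an assumption, not something this article confirms: the endpoint URL, the `q.enriched.url.title` and `return` parameter names, and the `now-7d` time window are recalled from the service's public documentation and may not match the current API. The helper `build_news_url` is hypothetical, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Assumed endpoint; check the provider's documentation before relying on it.
GET_NEWS_URL = "https://access.alchemyapi.com/calls/data/GetNews"


def build_news_url(api_key, keyword, count=10, start="now-7d", end="now"):
    """Build a query URL that looks for `keyword` in article titles only."""
    params = {
        "apikey": api_key,
        "outputMode": "json",
        "start": start,  # assumed relative-time syntax
        "end": end,
        "count": count,  # number of articles to return
        "q.enriched.url.title": keyword,  # assumed field for title-only matching
        "return": "enriched.url.title,enriched.url.url",  # assumed field list
    }
    return f"{GET_NEWS_URL}?{urlencode(params)}"


url = build_news_url("YOUR_API_KEY", "Marco Rubio")
print(url)
```

Keeping URL construction separate from the HTTP call makes it easy to inspect the query before spending any API quota.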
Putting it all Together
Now that we have all the pieces, let’s combine them.
While we might want to do something more advanced, the basics are all there. Converting the above snippets into PHP, we can use Clarify.io to request the keywords, iterate over them, submit each to the AlchemyData News API, and retrieve the results. The complete code looks something like this:
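The loop described above can also be sketched in Python (this is not the PHP the article refers to). The function `collect_news` is hypothetical, and the actual HTTP call is passed in as `fetch_news`, so the sketch makes no assumptions about either API's response format. The `fake_fetch` stand-in returns made-up headlines so the example runs without network access.

```python
def collect_news(keywords, fetch_news, limit=10):
    """Look up each distinct keyword and keep at most `limit` articles for it."""
    results = {}
    for keyword in keywords:
        term = keyword.strip()
        if not term or term in results:
            continue  # skip blank entries and keywords already looked up
        results[term] = list(fetch_news(term))[:limit]
    return results


def fake_fetch(term):
    # Stand-in for a real news API call; the headlines are placeholders.
    return [f"Placeholder headline {n} about {term}" for n in range(1, 13)]


report = collect_news(["Syria", "Putin", "Syria"], fake_fetch)
for term, articles in report.items():
    print(term, len(articles))
```

In a real run, `fetch_news` would wrap the news API request and the keyword list would come from the keywords insight for each bundle_id.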
The most important part, however, is the results. The script automatically retrieves the ten most recent news articles for each of our keywords. Some of the terms – like re-invigorate – aren’t likely to be useful, but the rest give us the context that we need to understand the issues being discussed.
Now that we have this code, we could load any important audio or video – other debates, the State of the Union Address, or even just last night’s news – and extract the details to understand the basics of any issue, topic, or person’s position. It makes it possible to understand almost anything in a matter of minutes.