Like many people working in Digital Forensics and Incident Response, I’m drawn to a good mystery.
The X-Force IR team was recently called in by a client to investigate why their Endpoint Detection & Response (EDR) tool was alerting on a file within the system files of a user’s Microsoft Edge web browser. The file in question was ‘Network Action Predictor’, which is a database storing web browsing artifacts in Chromium-based browsers such as Chrome, Edge and Brave.
In this article, I’m going to cover how I managed to parse the useful information from the Resource Prefetch Predictor tables in the Network Action Predictor database, and how it could be of use in your Digital Forensics, Incident Response or Threat Hunting workflow.
Everyone’s favourite SQLite DB browser, DB Browser for SQLite (DB4S), let us quickly confirm the domain names that the EDR tool had alerted on.
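For a scripted version of that first look, the sketch below lists every table in a copy of the database and flags which columns are declared as BLOBs. It is only an illustration: the function name blob_columns is my own, and it assumes you are working on a copy of the file rather than the live browser profile.

```python
import sqlite3
from pathlib import Path


def blob_columns(db_path):
    """Map each table name to the names of its BLOB-typed columns."""
    # Open read-only so the evidence file is never modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        tables = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        result = {}
        for table in tables:
            # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
            info = con.execute(f'PRAGMA table_info("{table}")').fetchall()
            result[table] = [name for _, name, col_type, *_ in info
                             if col_type.upper() == "BLOB"]
        return result
    finally:
        con.close()
```

Calling blob_columns on the path to your copy of the ‘Network Action Predictor’ file returns a dictionary, so tables with an empty list hold no blob columns.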
Googling ‘Network Action Predictor’ led me to Kevin Pagano’s 2021 blog on the topic, which is a great overview, but the table I was most interested in, resource_prefetch_predictor_origin, wasn’t mentioned.
Four of the six tables in the database store protobuf data in a blob in each record:
Of the other two:
Loading the blob from the resource_prefetch_predictor_origin table that was causing the EDR alerts into the “Cyber Swiss Army Knife”, CyberChef, as suggested in Kevin’s blog, gave the following:
I’m using the starwars.com entry from the same sample Android 11 image as Kevin’s blog as sample data above, but in the real investigation there was a mix of malicious and legitimate URLs featured.
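To show what a schema-less decode is working with, here is a minimal sketch of a protobuf wire-format reader in pure Python. It is not what CyberChef runs internally; it is just an illustration of the documented wire format (varints, fixed-width values and length-delimited fields), and the function names are my own. Without a schema you only get field numbers and raw values, which is why everything ends up looking anonymous.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:          # high bit clear: last byte of the varint
            return value, pos
        shift += 7


def decode_fields(buf):
    """Return a list of (field_number, wire_type, value) without a schema."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:           # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:         # fixed 64-bit, left as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:         # length-delimited: string, bytes or nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:         # fixed 32-bit, left as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((number, wire_type, value))
    return fields
```

A length-delimited value may itself be a nested message, so calling decode_fields again on that value is how you would dig one level deeper.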
There is something there, but what? I played around with CyberChef, trying to build a ‘.proto’ schema file to help decode this, but ended up labelling lots of fields UnknownDataA, UnknownDataB, etc. While doing this, I was skimming Google’s Protocol Buffers documentation and realized that the browser would have needed such a file to read and write the data for its own purposes. It occurred to me that, because Chromium is open source, that schema file might be publicly available.
This turned out to be the case, and once I’d provided the relevant sections of resource_prefetch_predictor.proto to CyberChef, the results were a lot more readable:
The next goal was to parse these URLs with a Python script so that they could be sent to a reputation-checking API or added to a timeline.
Google provides guidance for working with protobufs in Python, but it’s a touch fiddly and probably too time-consuming for the harried forensicator during a case. Luckily, once the module has been generated for a ‘.proto’ file, it can be reused. So here’s one I prepared earlier, adapting the Google tutorial to my purposes. I also made a file for the resource_prefetch_predictor_host_redirect blobs.
Note: to use these _pb2 files, you will need to install the protobuf library for Python. The easiest way to do this is with pip, by running ‘pip install protobuf’.
All that’s left is importing the respective file and parsing your Network Action Predictor database of choice:
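A minimal sketch of that step might look like the following. Several names here are assumptions rather than facts from the files above: that the table exposes ‘key’ and ‘proto’ columns, that the generated module is called resource_prefetch_predictor_pb2, and that the message class is OriginData. Adjust them to match your own _pb2 file and database.

```python
import sqlite3
from pathlib import Path


def read_blobs(db_path, table="resource_prefetch_predictor_origin"):
    """Yield (key, blob) for every record in the given table."""
    # Assumption: the table has 'key' and 'proto' columns.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"   # open read-only
    con = sqlite3.connect(uri, uri=True)
    try:
        yield from con.execute(f'SELECT key, proto FROM "{table}"')
    finally:
        con.close()


def parse_origin_blob(blob):
    """Parse one blob using the generated protobuf module."""
    # Assumptions: module name and message class; change both to suit
    # the _pb2 file you generated or downloaded.
    import resource_prefetch_predictor_pb2 as rpp_pb2
    record = rpp_pb2.OriginData()
    record.ParseFromString(blob)   # standard method on protobuf messages
    return record
```

With those in place, something like ‘for key, blob in read_blobs(path): RPPO = parse_origin_blob(blob)’ gives you one parsed record per row. The import sits inside the function so the database reader still works on a machine where the _pb2 file is not present.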
From there, you can work on accessing only the data you need. You may want to print the value in ‘RPPO.host’ and then loop through all of the origins in the record and print those URLs too:
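As a rough illustration, the helper below collects the host and each origin URL from a parsed record. The field names ‘origins’ (the repeated field) and ‘origin’ (the URL on each entry) are assumptions based on my reading of the schema, and describe_record is a hypothetical name; check both against your ‘.proto’ file.

```python
def describe_record(record):
    """Return the host followed by each origin URL as a list of lines."""
    lines = [f"host: {record.host}"]
    # Assumption: 'origins' is the repeated field and each entry has 'origin'.
    for entry in record.origins:
        lines.append(f"  origin: {entry.origin}")
    return lines
```

Printing the result with ‘print("\n".join(describe_record(RPPO)))’ gives one host line followed by an indented line per origin, which is easy to redirect into a file or feed to another tool.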
You may also want to decode the timestamp (which uses the same epoch as WebKit) with a function such as this:
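One way to write such a function is sketched below. It assumes the stored value counts microseconds since 1 January 1601 UTC, which is how WebKit-style timestamps are normally expressed; the function and constant names are my own.

```python
from datetime import datetime, timedelta, timezone

# WebKit-style timestamps count from 1601-01-01 00:00:00 UTC.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_datetime(timestamp):
    """Convert microseconds since the 1601 epoch to an aware UTC datetime."""
    # Assumption: the value is in microseconds, not seconds or milliseconds.
    return WEBKIT_EPOCH + timedelta(microseconds=timestamp)
```

Returning a timezone-aware datetime keeps the result unambiguous when it is merged into a timeline with artifacts from other sources.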
Head over to ChrisTappin/Make-Resource-Prefetch-Predictor-Happen on GitHub if you want an example script that can parse the records from a database, either to read or produce a CSV.