A look at the 'resource_prefetch_predictor_origin' table in Chromium browsers


Author

Jairo Hibaler

Managing Consultant

My colleague, Chris Tappin, recently published a blog post about an IBM X-Force Incident Response (IR) engagement. During that investigation, the team found what appeared to be a good source of evidence about user browsing activity, including malicious domains embedded in the websites the user visited.

While Chris’s blog provides guidance on how to parse useful information from the Resource Prefetch Predictor tables in the Network Action Predictor database, this blog dives deeper to reveal additional information found during the investigation that could be useful for forensic and incident response teams.

What happened

NOTE: Some of the observations or findings shown below are from tests I have done to understand how the Network Action Predictor database stores data.

The incident involved two domains, apiexplorerzone[.]com and blackshelter[.]org, which have been associated with the delivery of SocGholish malware via the Keitaro Traffic Distribution System (TDS). This activity has also been observed to lead to the delivery of RansomHub ransomware.

The X-Force IR team quickly collected triage data and evidence from the affected endpoint and started the analysis. Nothing significant was observed apart from hits for the known indicators of compromise (IOCs) – apiexplorerzone[.]com and blackshelter[.]org – in the Network Action Predictor database file for Microsoft Edge.

Figure 1: Using the grep command to look for hits on the 2 IOCs.
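
For analysts without grep at hand, the same kind of raw-byte search can be approximated in Python. The following is a minimal sketch, not the command shown in Figure 1: the function name `find_ioc_offsets` is hypothetical, and the search simply looks for each refanged domain as bytes anywhere in the file, ignoring ASCII case.

```python
from pathlib import Path


def find_ioc_offsets(file_path, iocs):
    """Return {ioc: [byte offsets]} for each IOC found in the raw file bytes.

    Defanged indicators such as "example[.]com" are refanged before searching.
    """
    # bytes.lower() only affects ASCII letters, which is enough for domains.
    data = Path(file_path).read_bytes().lower()
    hits = {}
    for ioc in iocs:
        needle = ioc.replace("[.]", ".").lower().encode("utf-8")
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        hits[ioc] = offsets
    return hits
```

Run against a working copy of the database file, for example `find_ioc_offsets("Network Action Predictor", ["apiexplorerzone[.]com", "blackshelter[.]org"])`, an empty list for an indicator means no raw match was found for it.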

After searching for the two domains and getting the result in Figure 1, we initially thought that the user had searched for something and been redirected to both domains at some point. As Kevin Pagano explains in his blog about how Google Chrome predicts what you are looking for or which website you are trying to visit, the browser tracks that data in the Network Action Predictor database. However, we wanted to find out exactly where in the database the two malicious domains were stored. To do this, we opened the Network Action Predictor database in DB Browser for SQLite and HxD, and our search pointed us to one of its tables, ‘resource_prefetch_predictor_origin’. Why and how those malicious domains ended up in that table was something we had to dig deeper to understand.
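
The manual search in DB Browser for SQLite can also be approximated with a short script. The sketch below walks every table in a copy of the database and reports which table and column contain a given string. It assumes the file opens as an ordinary SQLite database and that its text columns decode as UTF-8; `locate_string` is a hypothetical helper name, not part of any existing tool.

```python
import sqlite3
from pathlib import Path


def locate_string(db_path, needle):
    """Return (table, column, first-column value) for each cell containing needle."""
    needle_bytes = needle.lower().encode("utf-8")
    hits = []
    # Open read-only so the evidence copy is not modified by the query.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        tables = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        for table in tables:
            cur = con.execute(f'SELECT * FROM "{table}"')
            columns = [desc[0] for desc in cur.description]
            for row in cur:
                for column, value in zip(columns, row):
                    # Compare text and blob cells alike as lowercase bytes.
                    if isinstance(value, str):
                        value = value.encode("utf-8", "replace")
                    if isinstance(value, bytes) and needle_bytes in value.lower():
                        hits.append((table, column, row[0]))
    finally:
        con.close()
    return hits
```

Each hit names the table and column in which the string was found, plus the value of the row's first column so the record can be located again in a GUI viewer.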

The ‘resource_prefetch_predictor_origin’ table

Not much is known about the ‘resource_prefetch_predictor_origin’ table in the Network Action Predictor database, except that another table with a very similar name was mentioned in Kevin Pagano’s blog. Since the Microsoft Edge browser is also based on the Chromium source code, it is no surprise that the same table and database are present in the browser. Based on the table’s name, it could be used to load pages faster, similar to what Windows prefetch files do to optimize application loading times. The X-Force IR team had to do some testing to verify this and to understand how data is recorded and what kind of data is recorded in the table.

What we observed from the tests was the following:

  • Links that have been clicked by the user from the search results are recorded/stored in the table.
Figure 2: Sample search query and the results.
Figure 3: Clicked links from the search results.
Figure 4: Evidence of clicked links stored in the 'resource_prefetch_predictor_origin' table
  • Links or URLs that have been typed in the browser’s omnibox are also recorded/stored in the table.
Figure 5: Sample URL typed in the omnibox.
Figure 6: Another sample URL typed in the omnibox.
Figure 7: Another sample URL typed in the omnibox.
Figure 8: Typed URLs successfully loading
Figure 9: Typed URLs stored in the 'resource_prefetch_predictor_origin' table
  • Links embedded in the website’s source code using <script src=""></script> HTML tags are stored in the table’s protobuf blob.
Figure 10: Links recorded in 'resource_prefetch_predictor_origin' that are found in the webpage's source code using the <script></script> HTML tag
Figure 11: Links embedded into the webpage's source code using the <script></script> tag as seen in the 'resource_prefetch_predictor_origin' table
  • Records in the table are deleted when browsing data is cleared.
Figure 12: Microsoft Edge browsing data about to be cleared
Figure 13: Stored data in the 'resource_prefetch_predictor_origin' table being deleted after clearing the browsing data
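
Where a schema-aware parser is not available, a rough way to see which origins a blob references is to pull URL-like byte strings out of it, much like running a strings utility over the data. The sketch below does that with a regular expression. It does not decode the protobuf structure, so any counters or timestamps in the blob are ignored, and both the pattern and the name `extract_origins` are assumptions made for illustration.

```python
import re

# Scheme, host and optional port; paths are deliberately not matched.
ORIGIN_RE = re.compile(rb"https?://[A-Za-z0-9.\-]+(?::\d+)?")


def extract_origins(blob):
    """Return the unique origin-like strings in a blob, in order of appearance."""
    origins = []
    for match in ORIGIN_RE.finditer(blob):
        origin = match.group().decode("ascii")
        if origin not in origins:
            origins.append(origin)
    return origins
```

Feeding each blob value from the table into `extract_origins` gives a quick list of candidate origins to compare against known IOCs. Because this is pattern matching rather than parsing, treat the results as leads to confirm with a proper decoder.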

The observations above suggest that the user must have visited a website at least once for its links to be stored in the table. In turn, finding a malicious domain in the table may indicate that a website the user visited had been compromised and was being used by threat actors to reference malicious domains for redirection or for the execution of malicious scripts.

The value of the data

In most investigations, we typically look at the user’s browsing history to understand how they may have stumbled upon those malicious domains. Typically for Chromium-based browsers, we immediately look at the History database file to see the links the user has visited and the ‘network_action_predictor’ table in the Network Action Predictor database for the user’s browsing behaviors. In addition to this, we also rely on network logs, specifically on proxy and firewall logs, to verify those activities.

Figure 14: Folder locations of both the History and Network Action Predictor databases
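
A quick cross-check against the History database can be scripted as well. The sketch below assumes the History file is a SQLite database with a `urls` table holding `url`, `title`, `visit_count` and `last_visit_time` columns, as commonly documented for Chromium-based browsers; verify the schema on your own copy first. `history_hits` is a hypothetical helper name.

```python
import sqlite3


def history_hits(history_db, domain):
    """Return (url, title, visit_count, last_visit_time) rows whose URL contains domain.

    Assumes a Chromium-style `urls` table; run this against a copy of the file,
    since the live database is usually locked while the browser is open.
    """
    con = sqlite3.connect(history_db)
    try:
        rows = con.execute(
            "SELECT url, title, visit_count, last_visit_time "
            "FROM urls WHERE url LIKE ? ORDER BY last_visit_time",
            (f"%{domain}%",),
        ).fetchall()
    finally:
        con.close()
    return rows
```

An empty result for a domain that does appear in the ‘resource_prefetch_predictor_origin’ table would be consistent with the domain having been loaded as an embedded resource rather than visited directly, although cleared history is another possible explanation.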

With this rather newly found source of data, the information from this table could be used to support or supplement what we may have already found from other artifacts or pieces of evidence. Currently, there is no tool or script that we can use to quickly parse the table apart from CyberChef. Luckily, my colleague Chris Tappin developed a Python script that can quickly do the job for us and provide what we need, including other information that may be of interest for our investigation. Head over to his blog to learn more about the script. Below are screenshots of the two output formats the script can produce, depending on preference.

Figure 15: Standard output of Chris' script in Windows Terminal
Figure 16: Standard output of the script in .TXT format
Figure 17: Standard output of the script in .TXT format. Each entry is prefixed with '-'
Figure 18: One of the CSV files generated by the script. This file shows the output from the 'resource_prefetch_predictor_host_redirect' table. Formatted for readability
Figure 19: One of the CSV files generated by the script. This file shows the output from the 'resource_prefetch_predictor_origin' table. Formatted for readability

What’s good about the Python script Chris developed is, first and foremost, the readability of its output. We don’t have to navigate through the raw data in the table using DB Browser for SQLite, or scour the binary data in a hex editor, just to find the IOCs we are looking for. Doing so takes time, especially when a lot of data is stored in the database. With either the standard or CSV output generated by the script, we can easily find the host URL and/or the IOCs we are looking for by grepping, using the Find tool (CTRL + F), or by any other means at our disposal.

In addition to that, we can easily create a timeline of events by looking at the ‘Last Visited’ timestamp found in both outputs. The only difference between the two is that those in the CSV output are still in WebKit timestamp format, so we still need to convert them to human-readable format if we choose to use that output (Figures 18 and 19). On the other hand, the ‘Last Visited’ timestamps in the standard output have been converted to a human-readable version and are in the Coordinated Universal Time (UTC) time standard (Figures 16 and 17).
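
Converting the CSV values by hand is straightforward, because a WebKit timestamp counts microseconds since 1601-01-01 00:00:00 UTC. A minimal sketch of the conversion, assuming the CSV value is an integer or a string of digits; `webkit_to_utc` is a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone

# WebKit/Chromium timestamps count microseconds from this epoch.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_utc(value):
    """Convert a WebKit timestamp (microseconds since 1601-01-01 UTC) to a datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=int(value))
```

For example, `webkit_to_utc("11644473600000000")` corresponds to the Unix epoch, 1970-01-01 00:00:00 UTC, which is a handy sanity check before converting real values.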

Conclusion

It is well known in the digital forensics and incident response (DFIR) field that no two cybersecurity incidents are the same. There will always be differences in how incidents present themselves, and the same goes for our approach to getting the information we need to solve them. This was very much true of the incident handled by the X-Force IR team.

Because of how the investigation developed, we discovered how valuable the information stored in the ‘resource_prefetch_predictor_origin’ table of the Network Action Predictor database is, and we used it to understand how the incident transpired, even though not much is known about the database.

We hope that this article and the script developed by Chris help you in your cases as much as they helped us in ours. Cheers!
