Update 29/10/2020: A blog reader pointed out that the format of the X-Force Exchange STIX export had changed slightly. The Python script was adjusted accordingly.
The “Structured Threat Information Expression” (STIX) format is a way to represent and structure threat intelligence. Information provided in STIX format can, for example, be used to enrich security software with current indicators of compromise.
In other words, if you want your Intrusion Detection System (IDS), Intrusion Prevention System (IPS) or Security Information and Event Management (SIEM) system to discover and alert on potentially dangerous events in your network, you first need to tell it what “dangerous” means. For example, if there is a new way of attacking Windows servers, you would want your security systems to know about this attack vector.
Common data to load into your IDS / IPS / SIEM are file hashes and URLs: if you see that a system in your network is trying to connect to a specific URL (e.g. www.malicious.evil), or if you see a file with a hash that is known to belong to a malware file, you have a strong indicator that something in your network was compromised. For this reason, such known-to-be-malicious information is called an “Indicator of Compromise” (IoC). STIX is a way to collect IoCs, to describe them and to relate them to one another. More information can be found on GitHub.io.
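To make this more concrete, here is a small sketch of what an IoC and its relation to a piece of malware might look like in STIX 2.0, built as Python dictionaries and printed as JSON. All ids, names and the URL are made up for illustration, and required fields such as the timestamps are left out to keep it short.

```python
import json

# Hypothetical objects: ids, names and the URL are invented for illustration,
# and required STIX fields such as timestamps are omitted for brevity.
indicator = {
    "type": "indicator",
    "id": "indicator--00000000-0000-4000-8000-000000000001",
    "labels": ["malicious-activity"],
    # The IoC itself lives in the pattern, written in STIX patterning syntax.
    "pattern": "[url:value = 'http://www.malicious.evil/']",
}
malware = {
    "type": "malware",
    "id": "malware--00000000-0000-4000-8000-000000000002",
    "name": "ExampleMalware",
    "labels": ["trojan"],
}
# A relationship object links the indicator to the malware it points to.
relationship = {
    "type": "relationship",
    "id": "relationship--00000000-0000-4000-8000-000000000003",
    "relationship_type": "indicates",
    "source_ref": indicator["id"],
    "target_ref": malware["id"],
}
# STIX 2.0 content is usually shipped as a bundle of objects.
bundle = {
    "type": "bundle",
    "id": "bundle--00000000-0000-4000-8000-000000000004",
    "spec_version": "2.0",
    "objects": [indicator, malware, relationship],
}

print(json.dumps(bundle, indent=2))
```

Everything except the quoted value inside the pattern is description and context around the IoC.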
Now, if you simply want to load STIX data into your security systems, there is a ready-made protocol for that: the “Trusted Automated Exchange of Intelligence Information” (TAXII). However, sometimes you just want to extract all IoCs from one STIX repository, discard any metadata and work with the plain IoCs. Since STIX 2.0 is formatted in JSON, this is very easy to accomplish.
The following Python script strips all meta information from your STIX 2.0 data, stored in one or several JSON files, and writes the remaining pure IoCs (URLs, hashes, etc.) to the file “iocs.txt”. I specifically wrote this script for the STIX 2.0 elements used on the X-Force Exchange threat intelligence platform. However, it should be easy enough to adjust it to your needs.
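As a rough illustration of the general approach, and not the X-Force-specific script itself, a minimal sketch could look like the one below. It assumes each JSON file holds a STIX bundle with an "objects" list, and that the IoCs sit in the "pattern" field of indicator objects as simple comparisons such as [url:value = '...']. The function names extract_iocs and write_iocs are hypothetical.

```python
import json
import re
from pathlib import Path

# Matches the quoted value after "=" in simple STIX pattern comparisons,
# e.g. [url:value = 'http://www.malicious.evil/'].
# Assumption: only "=" comparisons with quoted string values are of interest.
VALUE_RE = re.compile(r"=\s*'((?:[^'\\]|\\.)*)'")


def extract_iocs(bundle):
    """Return the set of quoted pattern values from all indicators in a bundle."""
    iocs = set()
    for obj in bundle.get("objects", []):
        if obj.get("type") != "indicator":
            continue  # skip malware, relationships and other metadata objects
        iocs.update(VALUE_RE.findall(obj.get("pattern", "")))
    return iocs


def write_iocs(input_dir=".", output_file="iocs.txt"):
    """Collect IoCs from every *.json file in input_dir and write them out."""
    iocs = set()  # a set removes duplicates across files
    for path in Path(input_dir).glob("*.json"):
        with open(path, encoding="utf-8") as fh:
            iocs |= extract_iocs(json.load(fh))
    with open(output_file, "w", encoding="utf-8") as fh:
        for ioc in sorted(iocs):
            fh.write(ioc + "\n")
    return iocs
```

Calling write_iocs() in a directory of downloaded JSON files would then produce “iocs.txt” with one IoC per line. Exports that structure their patterns differently would need an adjusted regular expression.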
For example, if you download STIX data on threat actors that exploit the current coronavirus pandemic (requires free registration), you will get a file containing all metadata:
Running the script…
… results in a file that contains only the unique IoCs, ready for further use: