STIX 2.0 IoC export with Python

The “Structured Threat Information Expression” (STIX) format is a way to represent and structure threat intelligence. Information provided in STIX format can for example be used to enrich security software with current indicators of compromise.

In other words, if you want your Intrusion Detection System (IDS), Intrusion Prevention System (IPS) or Security Information and Event Management (SIEM) system to discover and alert on potentially dangerous events in your network, you first need to tell them what “dangerous” means. For example, if there is a new way of attacking Windows servers, you would want your security systems to know these “ways” (attack vectors).

Some common data that would be loaded into your IDS / IPS / SIEM would be file hashes or URLs: if you see that a system in your network is trying to connect to a specific URL (e.g. www.malicious.evil), or if you see a file whose hash is known to belong to malware, you have a strong indicator that something in your network was compromised. For this reason, such known-to-be-malicious information is called an “Indicator of Compromise” (IoC). STIX is a way to collect IoCs, to describe them and to relate them to each other. More information can be found on GitHub.io.

Now, if you simply want to load STIX data into your security systems, there is a ready-made protocol to load such data: The “Trusted Automated Exchange of Intelligence Information” (TAXII). However, sometimes you simply want to extract all IoCs from one STIX repository, discard any metadata and work just with the plain IoCs. Since STIX 2.0 is formatted in JSON, it is very easy to accomplish this.
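
To make the JSON structure a bit more concrete, here is a minimal sketch of a STIX 2.0 bundle containing a single indicator, read with Python's json module. The bundle itself (IDs, timestamps and the URL) is made up for illustration, and the helper name list_patterns is my own; real exports usually contain many more objects and properties.

```python
import json

# Hypothetical, hand-written bundle for illustration only;
# IDs, timestamps and the URL are invented
SAMPLE_BUNDLE = """
{
  "type": "bundle",
  "id": "bundle--00000000-0000-4000-8000-000000000001",
  "spec_version": "2.0",
  "objects": [
    {
      "type": "indicator",
      "id": "indicator--00000000-0000-4000-8000-000000000002",
      "created": "2020-04-01T00:00:00.000Z",
      "modified": "2020-04-01T00:00:00.000Z",
      "labels": ["malicious-activity"],
      "pattern": "[url:value = 'http://www.malicious.evil/']",
      "valid_from": "2020-04-01T00:00:00Z"
    }
  ]
}
"""


def list_patterns(bundle_text):
    """Return the pattern strings of all indicator objects in a bundle."""
    bundle = json.loads(bundle_text)
    return [
        obj["pattern"]
        for obj in bundle.get("objects", [])
        if obj.get("type") == "indicator" and "pattern" in obj
    ]


print(list_patterns(SAMPLE_BUNDLE))
```

The actual IoC is the quoted value inside the "pattern" property; everything else in the object is the metadata we want to discard.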

The following Python script will purge your STIX 2.0 data – stored in one or several JSON files – of all meta information and what remains are the pure IoCs (URLs, hashes, etc.) written into the file “iocs.txt”. I specifically wrote this script for the STIX 2.0 elements used on the X-Force Exchange threat intelligence platform. However, it should be easy enough to adjust it to your needs.

##################################################
# Exports IoCs from STIX 2.0 files and writes
# these into a text file.
# Parsing is optimized for STIX 2.0 files from
# https://exchange.xforce.ibmcloud.com
##################################################

__author__ = "Fabian Voith"
__version__ = "1.0.1"
__email__ = "admin@fabian-voith.de"

import json
import sys

# By using sets, we avoid having duplicate
# IoCs in our output file
fileNames = set()
iocData = set()
outputFile = 'iocs.txt'

# If no files were indicated or if we are running in Jupyter Notebook
# (i.e. first parameter = "-f"), 
# then we use a default filename "stix.json"
if len(sys.argv)<=1 or sys.argv[1]=="-f":
    fileNames.add("stix.json")
else:
    for x in sys.argv[1:]:
        fileNames.add(x)
        
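try:
    for fileName in fileNames:
        with open(fileName) as json_data:
            data = json.load(json_data)

        # Only objects with a "pattern" property (STIX indicators) carry an IoC;
        # any other object type in the file is skipped
        patterns = [obj["pattern"] for obj in data["objects"] if "pattern" in obj]
        print(str(len(patterns)) + " IoCs found in " + fileName)

        for item in patterns:
            # The value is expected to be the fourth space-separated token,
            # wrapped in single quotes; patterns with fewer tokens are skipped
            tokens = item.split(" ")
            if len(tokens) > 3:
                iocData.add(tokens[3].replace("'", ""))

    with open(outputFile, "w") as f:
        for ioc in iocData:
            f.write("%s\n" % ioc)
except Exception as e:
    print("An error occurred: " + str(e))
else:
    # Report success only if reading and writing actually worked
    print(str(len(iocData)) + " unique IoCs written to " + outputFile)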
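
The script relies on the value being the fourth space-separated token of the pattern, which fits the files it was written for. If your STIX source formats its patterns differently, a regular expression may be more forgiving. The following is only a sketch under my own assumptions: the helper name extract_values is hypothetical, it only handles simple comparisons of the form = 'value', and it assumes that values contain no escaped single quotes.

```python
import re

# Matches the quoted value after an equals sign, e.g. = 'value'.
# Assumption: values contain no escaped single quotes.
# Note: operators ending in "=" (such as !=) would match as well.
_VALUE_RE = re.compile(r"=\s*'([^']*)'")


def extract_values(pattern):
    """Return all quoted comparison values found in a STIX pattern string."""
    return _VALUE_RE.findall(pattern)


# Works regardless of the spacing around the brackets
print(extract_values("[url:value = 'http://www.malicious.evil/']"))
print(extract_values("[ url:value = 'http://www.malicious.evil/' ]"))
```

To try it in the script above, you could replace the split-based line with iocData.update(extract_values(item)).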

For example, if you download STIX data on threat actors that exploit the current Corona pandemic (requires free registration), you will get a file containing all metadata.

Running the script results in a file that contains only the unique IoCs (with duplicates removed), ready for further usage.
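
As one possible form of such further usage, the exported file can be loaded back into a set and compared against values you observe in your own logs. This is just a sketch: the function names load_iocs and find_matches are my own, and plain exact string matching is a simplification (real URLs or hashes may need normalization first).

```python
def load_iocs(path="iocs.txt"):
    """Read one IoC per line into a set, ignoring empty lines."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}


def find_matches(observed, iocs):
    """Return the observed values that are known IoCs (exact match only)."""
    return [value for value in observed if value in iocs]


# Illustration with an in-memory set instead of a real iocs.txt
known = {"http://www.malicious.evil/"}
seen = ["http://example.org/", "http://www.malicious.evil/"]
print(find_matches(seen, known))
```

With a real export you would call load_iocs() first and pass the resulting set to find_matches().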
