Cyberthreat News and CTI Extraction Techniques
An increasing number of cybersecurity-related articles and blogs are circulating on the internet. They describe the latest vulnerabilities, cutting-edge attack techniques, attack prevention guidelines, and threat distribution. Research organizations and cybersecurity practitioners also publish content on various forums, and some security experts regularly post updates on Twitter.
Textual cyberthreat information includes malware descriptions, zero-day vulnerabilities, and attackers’ tactics, techniques, and procedures (TTPs). These texts may also describe security vulnerabilities in an organization’s systems and network infrastructure.
Cyber threat intelligence (CTI) extraction techniques have been explored to help IT organizations develop tools to counter attackers’ malicious efforts. These techniques include extracting cybersecurity-related keywords from forum discussions and forum attachments. Social media posts have also been used as a source for CTI extraction. Using these techniques, researchers can identify cybersecurity-related events, events of specific types, and attacker motives.
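As a rough illustration of keyword extraction from forum posts, the sketch below counts how often terms from a small lexicon appear. The lexicon `SECURITY_TERMS`, the function `extract_keywords`, and the sample posts are all made up for this example. Real systems would likely rely on a curated or learned vocabulary.

```python
import re
from collections import Counter

# Hypothetical lexicon, for illustration only.
SECURITY_TERMS = {"ransomware", "phishing", "exploit", "botnet", "malware", "ddos"}

def extract_keywords(posts):
    """Count how often each lexicon term appears across a list of posts."""
    counts = Counter()
    for post in posts:
        # Lowercase and split into alphanumeric tokens.
        tokens = re.findall(r"[a-z0-9]+", post.lower())
        counts.update(t for t in tokens if t in SECURITY_TERMS)
    return counts

# Invented sample posts standing in for forum discussions.
posts = [
    "New phishing kit spotted, it drops ransomware after login.",
    "Selling access to a botnet, DDoS on demand.",
    "Anyone tried the new pizza place downtown?",
    "That ransomware exploit is being patched this week.",
]
keyword_counts = extract_keywords(posts)
print(keyword_counts.most_common(3))
```

Posts with no lexicon terms, such as the third one, contribute nothing to the counts. This is the simplest way such a step can separate security chatter from unrelated discussion.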
Researchers have also used clustering techniques to identify threats and hackers. These techniques include k-means, DBSCAN, affinity propagation, and hierarchical clustering. They group text segments that share similar properties into clusters, and the resulting clusters are then used to generate alerts. This approach is particularly useful when textual descriptions of cyberthreats are used as sources for CTI extraction.
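To make the clustering idea concrete, here is a minimal sketch of one of the named techniques, hierarchical (agglomerative) clustering, written in plain Python. It is an assumption-laden toy, not the method used in any particular study. It measures similarity as Jaccard overlap of word sets and uses single linkage. The 0.2 threshold and the sample segments are invented for illustration.

```python
import re

def tokens(text):
    """Lowercased word set for one text segment."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Share of words two segments have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster_segments(segments, threshold=0.2):
    """Single-linkage agglomerative clustering: repeatedly merge the two
    most similar clusters until no pair reaches the threshold."""
    token_sets = [tokens(s) for s in segments]
    clusters = [[i] for i in range(len(segments))]

    def similarity(c1, c2):
        # Single linkage: similarity of the closest pair of members.
        return max(jaccard(token_sets[i], token_sets[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best_score, best_pair = 0.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                score = similarity(clusters[x], clusters[y])
                if score > best_score:
                    best_score, best_pair = score, (x, y)
        if best_pair is None or best_score < threshold:
            break
        x, y = best_pair
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters

# Invented sample segments; each cluster holds indices into this list.
segments = [
    "ransomware encrypts files and demands bitcoin payment",
    "new ransomware strain encrypts files on windows servers",
    "phishing email steals bank login credentials",
    "phishing campaign steals login credentials from bank customers",
]
clusters = cluster_segments(segments)
print(clusters)
```

With these inputs the two ransomware segments end up in one cluster and the two phishing segments in another. A downstream step could raise one alert per cluster rather than one per segment.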
CTI extraction can give security experts a deeper understanding of the threat landscape. The process involves collecting texts, filtering them to identify sentences that relate to cybersecurity, and then feeding the filtered text to machine learning classifiers that assign event categories.
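The collect, filter, and classify steps can be sketched end to end as follows. This is a toy under stated assumptions. The filter vocabulary, the training sentences, and the category labels `ddos` and `data_breach` are all invented. A small multinomial naive Bayes classifier stands in for whatever machine learning model a real pipeline would use.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical filter vocabulary, for illustration only.
SECURITY_TERMS = {"malware", "breach", "ddos", "phishing",
                  "ransomware", "attack", "leaked", "stolen"}

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def filter_sentences(texts):
    """Split texts into sentences and keep those mentioning a security term."""
    kept = []
    for text in texts:
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if SECURITY_TERMS & set(tokenize(sentence)):
                kept.append(sentence)
    return kept

class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing."""

    def fit(self, sentences, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # class prior
            for word in tokenize(sentence):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented labeled examples standing in for a real training set.
training = [
    ("Attackers flooded the site with a ddos attack", "ddos"),
    ("A botnet launched a ddos attack on the bank", "ddos"),
    ("Customer records were stolen in a data breach", "data_breach"),
    ("Leaked passwords from the breach were posted online", "data_breach"),
]
model = NaiveBayes().fit([s for s, _ in training], [l for _, l in training])

# Step 1: collected texts. Step 2: filter. Step 3: classify.
collected = [
    "The weather was nice today. A ddos attack took the game servers offline.",
    "Thousands of records were leaked after the breach. We had lunch at noon.",
]
relevant = filter_sentences(collected)
results = [(sentence, model.predict(sentence)) for sentence in relevant]
for sentence, category in results:
    print(category, "|", sentence)
```

Filtering before classification keeps unrelated sentences away from the classifier, so the model only has to choose among event categories rather than also decide whether a sentence is about security at all.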