Canadian Institute for Cybersecurity

Malware Memory Analysis

CIC-MalMem-2022

Obfuscated malware is malware that hides itself to avoid detection and removal. The obfuscated malware dataset is designed to test memory-based methods for detecting obfuscated malware. It was built to represent a real-world situation as closely as possible, using malware that is prevalent in the real world. Made up of Spyware, Ransomware and Trojan Horse malware, it provides a balanced dataset that can be used to test obfuscated malware detection systems.

The memory dumps were captured in debug mode so that the dumping process itself does not show up in the dumps. This gives a more accurate picture of what an average user would have running at the time of a malware attack.

1. Introduction

The obfuscated malware dataset focuses on simulating real-world scenarios. Figure 1 shows the breakdown of benign and malicious memory dumps. Figure 2 shows which malware families are used in each malware category: Spyware (2A), Ransomware (2B), and Trojan Horse (2C). Figure 3 shows the malware families used across the whole dataset.

Figure 1: Memory Dump Categories

Figure 2A: Spyware Families

Figure 2B: Ransomware Families

Figure 2C: Trojan Horse Families

Figure 3: Complete dataset breakdown


2. Dataset details

The dataset is balanced: 50% of the memory dumps are malicious and 50% are benign. It contains 58,596 records in total, 29,298 benign and 29,298 malicious. The breakdown by malware family is shown in the table below. Figure 4 shows the total count of each malware family in each malware category.
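
The balance claim is easy to check once the class labels are loaded. The following is a minimal sketch, assuming the labels are available as a plain sequence of strings. The label names `Benign` and `Malware` and the helper `class_balance` are illustrative assumptions, and the demo list is synthetic, built only from the totals stated above.

```python
from collections import Counter


def class_balance(labels):
    """Return {label: (count, fraction of all records)} for a sequence of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Synthetic stand-in for the real label column, built from the stated totals.
# "Benign" and "Malware" are assumed label names, not confirmed by the source.
labels = ["Benign"] * 29298 + ["Malware"] * 29298

balance = class_balance(labels)
for label, (n, frac) in balance.items():
    print(f"{label}: {n} records ({frac:.0%})")
```

With the real data, the `labels` list would be replaced by whichever column holds the benign/malicious class.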

Malware category    Malware family    Count
Trojan Horse        Zeus              195
                    Emotet            196
                    Refroso           200
                    scar              200
                    Reconyc           157
Spyware             180Solutions      200
                    Coolwebsearch     200
                    Gator             200
                    Transponder       241
                    TIBS              141
Ransomware          Conti             200
                    MAZE              195
                    Pysa              171
                    Ako               200
                    Shade             220

Figure 4: Malware Table Breakdown
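
The per-category totals implied by the table can be computed directly. The sketch below transcribes the counts, assuming each family is paired with the count in the same position within its category. The names `FAMILY_COUNTS` and `category_totals` are illustrative.

```python
# Counts transcribed from the table above. Each family is paired with the
# count in the same position within its category (an assumption about the
# table layout).
FAMILY_COUNTS = {
    "Trojan Horse": {"Zeus": 195, "Emotet": 196, "Refroso": 200,
                     "scar": 200, "Reconyc": 157},
    "Spyware": {"180Solutions": 200, "Coolwebsearch": 200, "Gator": 200,
                "Transponder": 241, "TIBS": 141},
    "Ransomware": {"Conti": 200, "MAZE": 195, "Pysa": 171,
                   "Ako": 200, "Shade": 220},
}


def category_totals(family_counts):
    """Sum the family counts within each malware category."""
    return {category: sum(families.values())
            for category, families in family_counts.items()}


totals = category_totals(FAMILY_COUNTS)
grand_total = sum(totals.values())

for category, total in totals.items():
    print(f"{category}: {total}")
print(f"All families: {grand_total}")
```

Under this pairing the totals are 948 (Trojan Horse), 982 (Spyware) and 986 (Ransomware), 2,916 in all. This differs from the 29,298 malicious records quoted above, so the two figures presumably count different things.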


License

Tristan Carrier, Princy Victor, Ali Tekeoglu, Arash Habibi Lashkari, “Detecting Obfuscated Malware using Memory Feature Engineering”, The 8th International Conference on Information Systems Security and Privacy (ICISSP), 2022

Download the dataset