Intelligent Memory-Based Obfuscated Malware Detector

Nov 30, 2024

This repository contains the implementation and documentation for the Intelligent Memory-Based Obfuscated Malware Detector, a project developed as part of the Bachelor of Technology degree in Computer Science Engineering at Jaypee Institute of Information Technology, Noida.

Overview

Modern malware frequently uses obfuscation to evade traditional detection systems. This project addresses the problem with a memory-based, explainable detector for obfuscated malware: machine learning models classify features extracted from memory dumps, and explainable AI methods make each decision interpretable.

The system is designed to be lightweight and transparent: it relies on only five selected features and explains its predictions, so that detection is both accurate and interpretable.


Features

  • Memory-Based Analysis: Utilizes memory dumps to detect obfuscated malware.

  • Explainable AI: Explains decisions using SHAP (SHapley Additive exPlanations).

  • Lightweight Design: Employs Recursive Feature Elimination (RFE) to select only five key features for detection.

  • User-Friendly Interface: Built with Streamlit, a Python framework, for real-time user interaction.
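
To make the explainability feature more concrete, here is a minimal sketch of the Shapley attribution idea that SHAP builds on. It does not use the SHAP library and is not the project's code: the function `shapley_values`, the toy scoring function `toy_score`, and all numbers are hypothetical, chosen only for illustration. Features absent from a coalition are replaced by a baseline value, which is one common convention.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions for one prediction.

    Features outside a coalition take their baseline value (an assumption
    of this sketch). The cost grows exponentially with the number of
    features, so this is only practical for a handful of them.
    """
    n = len(instance)

    def value(coalition):
        # Model output when only the coalition's features use the instance values.
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    attributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                phi += weight * (value(coalition | {i}) - value(coalition))
        attributions.append(phi)
    return attributions


def toy_score(x):
    """Hypothetical score over three memory features (not a trained model)."""
    nproc, avg_threads, ndlls = x
    return 0.02 * nproc + 0.1 * avg_threads + 0.0001 * nproc * ndlls


instance = [40.0, 12.0, 1500.0]   # made-up sample
baseline = [45.0, 10.0, 1700.0]   # made-up reference point
phis = shapley_values(toy_score, instance, baseline)
print(phis)
```

The attributions sum to the difference between the score of the sample and the score of the baseline, which is the additive property that lets a per-feature breakdown be shown next to each prediction.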

Dataset

The system is tested on the MalMem2022 dataset, which contains 58,596 instances split evenly between benign and malware samples (29,298 each). Features are extracted from memory dumps with the Volatility Framework and cover memory-specific characteristics such as:

  • Number of running processes.

  • Average threads per process.

  • Number of DLLs loaded.
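
As a rough illustration of how such a feature table might be read and its class balance checked, the sketch below parses a small inline CSV. The column names (`pslist.nproc`, `pslist.avg_threads`, `dlllist.ndlls`, `Class`) are assumptions about the dataset layout, and the four rows are invented for the example; they are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Invented rows; column names are assumed, not confirmed by the project.
SAMPLE_CSV = """pslist.nproc,pslist.avg_threads,dlllist.ndlls,Class
45,10.5,1694,Benign
41,11.2,1500,Benign
38,9.8,1382,Malware
47,12.1,1737,Malware
"""


def load_rows(handle, label_column="Class"):
    """Return a list of (features, label) pairs from a CSV file handle."""
    rows = []
    for record in csv.DictReader(handle):
        label = record.pop(label_column)
        features = {name: float(value) for name, value in record.items()}
        rows.append((features, label))
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))

# Count samples per class and check that every class has the same count.
class_counts = Counter(label for _, label in rows)
is_balanced = len(set(class_counts.values())) == 1
print(class_counts, is_balanced)
```

With the real file, the same check should report an even split between the two classes if the description above holds.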

Technology Stack

  • Programming Language: Python

  • Libraries Used:

    • NumPy: Numerical computations.

    • Pandas: Data manipulation and analysis.

    • Matplotlib: Visualization.

    • Scikit-Learn: Machine learning and evaluation tools.

    • XGBoost: Gradient boosting algorithms.

    • SHAP: Explainability of model predictions.

  • Framework: Streamlit (for UI development).


Algorithms Used

  1. Recursive Feature Elimination (RFE): For feature selection.

  2. Machine Learning Models:

    • Random Forest

    • Decision Tree

    • Gaussian Naive Bayes

    • Extreme Gradient Boosting (XGBoost)

  3. 10-Fold Cross-Validation: To validate model generalization.
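
The three steps above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the repository's code: the data is synthetic (generated by `make_classification`), a decision tree is assumed as the ranking estimator inside RFE, and the hyperparameters are arbitrary. XGBoost is left out so that the sketch depends only on scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the memory features (assumption: not the real dataset).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=5, n_redundant=2, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gaussian Naive Bayes": GaussianNB(),
}

# 10 stratified folds keep the class ratio the same in every split.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

fold_scores = {}
for name, model in models.items():
    # RFE sits inside the pipeline so features are re-selected on each
    # training fold and the held-out fold never influences the selection.
    pipeline = Pipeline([
        ("select", RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=5)),
        ("classify", model),
    ])
    fold_scores[name] = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")

for name, scores in fold_scores.items():
    print(f"{name}: mean accuracy {scores.mean():.3f} over {len(scores)} folds")

# Fit RFE once on all data to see which five column indices survive.
selector = RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=5).fit(X, y)
selected_columns = [i for i, keep in enumerate(selector.support_) if keep]
print("Selected feature indices:", selected_columns)
```

RFE needs an estimator that exposes feature importances or coefficients, which Gaussian Naive Bayes does not, so the sketch ranks features with a tree and then evaluates every model on the selected subset. If the `xgboost` package is installed, an `XGBClassifier` could be added to the `models` dictionary in the same way.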


GitHub Repo: https://github.com/divyanshwrite/obfucated-malware-detector?tab=readme-ov-file#overview

Contributors

  • Divyansh Singh Parihar

  • Ayush Kumar

  • Suraj Prakash

  • Nishant Singh

  • Yash Sharma

© 2024 Digital Frontier Digest.

Designed & Developed By Digital Frontier Digest
