Topic starter 17/08/2025 6:12 pm
Here's a small Python-based antivirus prototype that uses machine learning to flag malicious files. It trains a random forest classifier on two labeled features per file (size and byte entropy), then uses that model to scan files and move anything predicted as malicious into a quarantine folder.
🗂 Folder Structure
ai_antivirus/
├── data/
│   └── file_samples.csv
├── models/
│   └── av_model.pkl
├── quarantine/
├── src/
│   ├── __init__.py
│   ├── data_loader.py
│   ├── feature_extractor.py
│   ├── model_trainer.py
│   ├── scanner.py
│   └── quarantine.py
├── test_files/
├── main.py
└── README.md
📄 File: data/file_samples.csv
Simulated file metadata and labels. Entropy is byte-level Shannon entropy in bits per byte, so valid values range from 0 to 8:
filename,size,entropy,label
safe1.txt,1024,3.2,clean
malware1.exe,2048,7.8,malicious
safe2.doc,512,2.9,clean
trojan.dll,4096,7.9,malicious
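Since the model learns directly from these rows, it may help to sanity-check the CSV before training. The sketch below is only an illustration: `validate_samples` and `VALID_LABELS` are hypothetical names that are not part of the project, and the sample rows are made up to trigger each check.

```python
import csv
import io

# Hypothetical helper (not part of the original project): checks each row
# of a samples CSV before it is used for training.
VALID_LABELS = {"clean", "malicious"}


def validate_samples(csv_text):
    """Return a list of (line_number, problem) tuples; an empty list means no problems."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            size = int(row["size"])
            entropy = float(row["entropy"])
        except (KeyError, TypeError, ValueError):
            problems.append((line_no, "size or entropy is missing or not numeric"))
            continue
        if size < 0:
            problems.append((line_no, "size is negative"))
        # Byte-level Shannon entropy is bounded by log2(256) = 8 bits.
        if not 0.0 <= entropy <= 8.0:
            problems.append((line_no, "entropy outside 0..8"))
        if row.get("label") not in VALID_LABELS:
            problems.append((line_no, "unknown label"))
    return problems


# Made-up rows for illustration only.
sample = """filename,size,entropy,label
safe1.txt,1024,3.2,clean
broken.bin,2048,9.1,malicious
odd.doc,512,2.9,benign
"""
for line_no, problem in validate_samples(sample):
    print(f"line {line_no}: {problem}")
```

Here the second data row is flagged for an impossible entropy value and the third for a label the classifier was never meant to see.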
📄 File: src/data_loader.py
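import pandas as pd


def load_data(path):
    """Load labeled samples and split them into features (X) and labels (y)."""
    df = pd.read_csv(path)
    # The filename is an identifier, not a feature, so drop it along with the label.
    X = df.drop(["filename", "label"], axis=1)
    y = df["label"]
    return X, y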
📄 File: src/feature_extractor.py
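import math
import os
from collections import Counter


def calculate_entropy(data):
    """Return the Shannon entropy of a bytes object, in bits per byte (0 to 8)."""
    if not data:
        return 0
    total = len(data)
    entropy = 0
    # Counter tallies every byte value in one pass instead of calling count() per value.
    for count in Counter(data).values():
        p_x = count / total
        entropy -= p_x * math.log2(p_x)
    return round(entropy, 2)


def extract_features(file_path):
    """Return a dict with "size" and "entropy" for a file, or None if it cannot be read."""
    try:
        with open(file_path, "rb") as f:
            content = f.read()
        size = os.path.getsize(file_path)
        entropy = calculate_entropy(content)
        return {
            "size": size,
            "entropy": entropy,
        }
    except OSError as e:
        print(f"Error extracting features: {e}")
        return None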
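To get a feel for the entropy feature, here is a minimal standalone sketch of the same Shannon entropy formula applied to a few byte strings. `byte_entropy` is a hypothetical name used only for this illustration, and it skips the rounding step.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(1/p) over the byte values that actually occur.
    return sum((c / total) * math.log2(total / c) for c in Counter(data).values())


print(byte_entropy(b"aaaaaaaa"))        # a single repeated byte: 0.0
print(byte_entropy(b"abababab"))        # two equally likely bytes: 1.0
print(byte_entropy(bytes(range(256))))  # every byte value exactly once: 8.0
```

Plain text tends to sit in the lower part of that range, while compressed or encrypted data approaches 8, which is why entropy is used as a signal here. Legitimate archives and installers are also high-entropy, so the feature alone cannot separate clean files from malicious ones.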
📄 File: src/model_trainer.py
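import os

import joblib
from sklearn.ensemble import RandomForestClassifier


def train_model(X, y, model_path):
    """Fit a random forest on the labeled features and save it to model_path."""
    model = RandomForestClassifier()
    model.fit(X, y)
    # Create the target folder (e.g. models/) if it does not exist yet.
    os.makedirs(os.path.dirname(model_path) or ".", exist_ok=True)
    joblib.dump(model, model_path)
    print(f"✅ Model saved to {model_path}")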
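Note that `train_model` fits on every row it is given and never measures how well the model performs. One way to check a detector like this is to hold back some labeled files, predict on them, and count the errors. The sketch below assumes you already have two lists of labels; `detection_metrics` is a hypothetical helper, and the labels are made up for illustration rather than produced by the model.

```python
def detection_metrics(y_true, y_pred, positive="malicious"):
    """Precision and recall for the positive class from two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # caught threats
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # clean files flagged
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # missed threats
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Made-up labels for illustration only; these are not results from the model.
y_true = ["clean", "malicious", "clean", "malicious", "clean"]
y_pred = ["clean", "malicious", "malicious", "clean", "clean"]
print(detection_metrics(y_true, y_pred))
```

For a scanner that quarantines automatically, false positives matter as much as missed threats, because each one moves a clean file out of place.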
📄 File: src/scanner.py
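import joblib
import pandas as pd

from src.feature_extractor import extract_features
from src.quarantine import quarantine_file

MODEL_PATH = "models/av_model.pkl"
# Column order must match the features the model was trained on.
FEATURE_COLUMNS = ["size", "entropy"]


def scan_file(file_path):
    """Predict a label for one file and quarantine it if predicted malicious."""
    features = extract_features(file_path)
    if features is None:
        return None
    model = joblib.load(MODEL_PATH)
    # A DataFrame with the training column names avoids scikit-learn's feature-name warning.
    row = pd.DataFrame([features], columns=FEATURE_COLUMNS)
    prediction = model.predict(row)[0]
    print(f"🔍 Scanned {file_path}: {prediction}")
    if prediction == "malicious":
        quarantine_file(file_path)
    return prediction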
📄 File: src/quarantine.py
import os
import shutil

QUARANTINE_DIR = "quarantine"


def quarantine_file(file_path):
    """Move a file into the quarantine folder, creating the folder if needed."""
    os.makedirs(QUARANTINE_DIR, exist_ok=True)
    filename = os.path.basename(file_path)
    dest = os.path.join(QUARANTINE_DIR, filename)
    shutil.move(file_path, dest)
    print(f"🚨 File quarantined: {dest}")
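One thing to watch with `quarantine_file`: the destination is built from the basename only, so two different files with the same name can end up at the same quarantine path, and the second move can replace the first. A possible variation, sketched under the assumption that a numeric suffix is an acceptable naming scheme (`unique_destination` and `quarantine_safely` are hypothetical names), looks like this:

```python
import os
import shutil
import tempfile


def unique_destination(directory, filename):
    """Return a path in `directory` that does not exist yet, adding _1, _2, ... if needed."""
    base, ext = os.path.splitext(filename)
    candidate = os.path.join(directory, filename)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(directory, f"{base}_{counter}{ext}")
        counter += 1
    return candidate


def quarantine_safely(file_path, quarantine_dir):
    """Move file_path into quarantine_dir without overwriting an earlier file."""
    os.makedirs(quarantine_dir, exist_ok=True)
    dest = unique_destination(quarantine_dir, os.path.basename(file_path))
    shutil.move(file_path, dest)
    return dest


# Demonstration in a throwaway directory: two different files share one name.
with tempfile.TemporaryDirectory() as tmp:
    qdir = os.path.join(tmp, "quarantine")
    for folder in ("a", "b"):
        os.makedirs(os.path.join(tmp, folder))
        src = os.path.join(tmp, folder, "bad.exe")
        with open(src, "wb") as f:
            f.write(folder.encode())
        print(os.path.basename(quarantine_safely(src, qdir)))
```

The demo keeps both files, as `bad.exe` and `bad_1.exe`. The existence check and the move are separate steps, so this is not safe if several scanners write to the same folder at once.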
📄 File: main.py
from src.data_loader import load_data
from src.model_trainer import train_model
from src.scanner import scan_file

DATA_PATH = "data/file_samples.csv"
MODEL_PATH = "models/av_model.pkl"

if __name__ == "__main__":
    # Step 1: Train model
    X, y = load_data(DATA_PATH)
    train_model(X, y, MODEL_PATH)

    # Step 2: Scan a file
    scan_file("test_files/suspicious.exe")
📄 File: README.md
# AI Anti-Virus System
This project uses machine learning to detect and quarantine malicious files based on entropy and size.
## Features
- Trainable ML model
- On-demand file scanning
- Automatic quarantine of threats
## How to Use
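1. Add labeled samples to `data/file_samples.csv`
2. Run `python main.py` to train the model and save it to `models/av_model.pkl`
3. Place the file to scan at `test_files/suspicious.exe`, or change the path passed to `scan_file` in `main.py`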
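As written, `main.py` scans a single hard-coded path. If you'd rather scan everything under a folder such as `test_files/`, a rough sketch could walk the directory and hand each file to a callback. `iter_files` and `scan_directory` are hypothetical names, and the demo uses a stand-in callback where the project would pass `scan_file`.

```python
import os
import tempfile


def iter_files(root):
    """Yield the path of every file under root, in a deterministic order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place fixes the order in which subfolders are visited
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def scan_directory(root, scan_one):
    """Call scan_one(path) for each file under root; return how many files were visited."""
    count = 0
    for path in iter_files(root):
        scan_one(path)
        count += 1
    return count


# Demonstration with a stand-in callback; in the project, scan_one would be scan_file.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "nested"))
    for rel in ("a.txt", os.path.join("nested", "b.bin")):
        with open(os.path.join(tmp, rel), "wb") as f:
            f.write(b"x")
    visited = []
    print(scan_directory(tmp, visited.append))  # number of files visited
```

If you try this, keep the quarantine folder outside the directory being scanned so that quarantined files are not picked up again on the next pass.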