The NSL-KDD Feature Extractor is a Python-based tool designed to process network traffic packets and extract features compliant with the NSL-KDD dataset format. It enables researchers and developers to analyze network traffic and apply machine learning models for intrusion detection, anomaly detection, or other cybersecurity applications.
Dataset used for testing: https://www.kaggle.com/datasets/hassan06/nslkdd/data
-
Packet Analysis
- Supports live packet capture using scapy.
- Processes TCP, UDP, ICMP, ARP, and DNS packets.
-
Feature Extraction
- Generates NSL-KDD dataset-compatible features for machine learning.
- Includes connection-based and statistical features such as same_srv_rate, srv_serror_rate, and more.
-
Customizable and Scalable
- Easily extendable for new protocols or custom features.
- Handles both live traffic and offline packet captures.
-
Internal Traffic Filtering
- Option to exclude internal traffic during feature extraction.
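As an illustration of the statistical features listed above, the sketch below computes same_srv_rate and srv_serror_rate over a two-second window of earlier connections. It is not the project's implementation: the `Connection` record, the `window_rates` helper, and the choice of S0 to S3 as "SYN error" flags are assumptions made for this example, and the current connection is left out of its own window for simplicity.

```python
from dataclasses import dataclass

# Assumption: these connection flags are treated as "SYN errors".
SYN_ERROR_FLAGS = {"S0", "S1", "S2", "S3"}


@dataclass
class Connection:
    """Hypothetical connection record; not the project's own class."""
    timestamp: float  # seconds
    dst_host: str
    service: str
    flag: str


def window_rates(current, history, window=2.0):
    """Return (same_srv_rate, srv_serror_rate) for `current`.

    Only connections in `history` that started at most `window`
    seconds before `current` are considered.
    """
    recent = [c for c in history
              if 0 <= current.timestamp - c.timestamp <= window]
    same_host = [c for c in recent if c.dst_host == current.dst_host]
    same_srv = [c for c in recent if c.service == current.service]

    # Share of same-host connections that also use the same service.
    same_srv_rate = (
        sum(c.service == current.service for c in same_host) / len(same_host)
        if same_host else 0.0
    )
    # Share of same-service connections that ended with a SYN error.
    srv_serror_rate = (
        sum(c.flag in SYN_ERROR_FLAGS for c in same_srv) / len(same_srv)
        if same_srv else 0.0
    )
    return same_srv_rate, srv_serror_rate


history = [
    Connection(0.0, "10.0.0.5", "http", "SF"),
    Connection(0.5, "10.0.0.5", "http", "S0"),
    Connection(1.0, "10.0.0.5", "ftp", "SF"),
    Connection(1.2, "10.0.0.9", "http", "SF"),
]
current = Connection(1.5, "10.0.0.5", "http", "SF")
print(window_rates(current, history))
```

Both rates fall back to 0.0 when the window is empty, which avoids a division by zero on the first packet of a capture.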
+------------------+
| Network Traffic|
+------------------+
|
v
+-------------------------------+
| Packet Capturing |
| (Using Scapy Framework) |
+-------------------------------+
|
v
+----------------------------------------+
| NSL-KDD Feature Extraction |
| (network_feature_extractor.py) |
+----------------------------------------+
|
v
+----------------------------------------+
| Generated Feature Set |
| - Duration, Protocol Type, Service |
| - Flag, Src Bytes, Dst Bytes |
| - Statistical Features (e.g., |
| srv_serror_rate, same_srv_rate) |
+----------------------------------------+
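The Service field in the generated feature set is usually derived from the destination port and the transport protocol. The lookup below is a minimal sketch of that idea, not the code in network_feature_extractor.py: the `PORT_TO_SERVICE` table and the `service_for` helper are hypothetical and cover only a few well-known ports, using service names that appear in the NSL-KDD dataset.

```python
# Assumption: a small illustrative subset of ports mapped to NSL-KDD
# service names. The real extractor may use a different table.
PORT_TO_SERVICE = {
    20: "ftp_data",
    21: "ftp",
    22: "ssh",
    23: "telnet",
    25: "smtp",
    53: "domain",
    80: "http",
    443: "http_443",
}


def service_for(dst_port, protocol):
    """Map a destination port and protocol to an NSL-KDD service name."""
    protocol = protocol.lower()
    # NSL-KDD labels DNS over UDP separately from DNS over TCP.
    if dst_port == 53 and protocol == "udp":
        return "domain_u"
    # Anything not in the table falls back to the generic label.
    return PORT_TO_SERVICE.get(dst_port, "other")


print(service_for(80, "tcp"))
print(service_for(53, "udp"))
print(service_for(12345, "tcp"))
```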
- Python 3.11 or later
- Scapy for packet capture
- Pandas for data manipulation
-
Clone the repository:
git clone https://github.com/Ate329/NSL-KDD-feature-extractor.git
cd NSL-KDD-feature-extractor
-
Install required dependencies:
pip install -r requirements.txt
from network_feature_extractor import NetworkFeatureExtractor
# Initialize the extractor
extractor = NetworkFeatureExtractor(interface="eth0", timeout=60)
# Capture live traffic and extract features
def process_packet(packet):
    features = extractor.extract_features(packet)
    if features:
        print(features)
extractor.start_capture(callback=process_packet)
Extracted features will include:
{
"duration": 1.23,
"protocol_type": "tcp",
"service": "http",
"flag": "SF",
"src_bytes": 345,
"dst_bytes": 512,
"same_srv_rate": 0.75,
"srv_serror_rate": 0.0,
...
}
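Feature dictionaries like the one above can be stored as CSV rows for later model training. The sketch below uses only the standard library; the `FIELDS` list and the `features_to_csv` helper are assumptions for this example and cover only the columns shown above, not the full NSL-KDD feature set.

```python
import csv
import io

# Assumption: only the columns shown in the sample output, for brevity.
FIELDS = ["duration", "protocol_type", "service", "flag",
          "src_bytes", "dst_bytes", "same_srv_rate", "srv_serror_rate"]


def features_to_csv(rows, out):
    """Write feature dicts to the file-like object `out` as CSV.

    Keys not listed in FIELDS are ignored; missing keys become empty cells.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS,
                            restval="", extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


sample = {
    "duration": 1.23,
    "protocol_type": "tcp",
    "service": "http",
    "flag": "SF",
    "src_bytes": 345,
    "dst_bytes": 512,
    "same_srv_rate": 0.75,
    "srv_serror_rate": 0.0,
}

buffer = io.StringIO()
features_to_csv([sample], buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` in place of the `StringIO` buffer.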
-
Add New Features:
- Extend the extract_features() method to compute additional metrics.
-
Handle Custom Protocols:
- Add specific processing for protocols like DNS or HTTP in _extract_ip_features() or _extract_arp_features().
-
Exclude Internal Traffic:
- Enable internal traffic detection using the detect_internal=True parameter.
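How detect_internal=True classifies a packet is not described here. One plausible rule, sketched below, treats traffic as internal when both endpoints are private, loopback, or link-local addresses. The `is_internal` helper is hypothetical and relies only on Python's standard `ipaddress` module.

```python
import ipaddress


def is_internal(src_ip, dst_ip):
    """Return True when both endpoints look like local-network addresses.

    Assumption: "internal" means private, loopback, or link-local.
    The project's own rule may differ.
    """
    def is_local(ip):
        addr = ipaddress.ip_address(ip)
        return addr.is_private or addr.is_loopback or addr.is_link_local

    return is_local(src_ip) and is_local(dst_ip)


print(is_internal("192.168.1.10", "10.0.0.5"))
print(is_internal("192.168.1.10", "8.8.8.8"))
```

A callback could use such a check to skip packets before feature extraction, for example when only traffic crossing the network boundary is of interest.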
- This feature extractor follows the NSL-KDD feature specification, so its output can be passed to machine learning models trained on NSL-KDD or similarly structured datasets.
- The modular structure makes it adaptable for other datasets or real-world scenarios.
We welcome contributions! If you’d like to extend the functionality or report a bug, feel free to submit a pull request or open an issue.
This project is licensed under the MIT License. See the LICENSE file for more details.