BanglaNoise-10: A Diverse Audio Dataset for Machine Learning on Urban Environmental Noise in Bangladesh
Data in Brief (Under Review) • 2025
Authors: Md. Nasir Uddin, Md. Mehedi Hasan, and Mohammad Shahidur Rahman
Affiliation: Department of CSE, Shahjalal University of Science and Technology, Bangladesh
Abstract
Urban environmental noise data are critical for developing and evaluating audio analysis systems, yet publicly available datasets reflecting real-world urban noise in Bangladesh are limited. This data article presents BanglaNoise-10, an environmental audio dataset supporting research on urban noise analysis and sound processing.
Dataset Specifications
| Specification | Details |
|---|---|
| Total Recordings | 5,035 |
| Duration per clip | 10 seconds |
| Format | WAV (16 kHz, mono) |
| Categories | 10 urban noise classes |
| License | CC BY 4.0 |
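The specifications above can be checked programmatically before training. The following is a minimal sketch, using only the Python standard library, that tests whether one clip matches the stated sample rate, channel count, and duration. The function names and the `tolerance_s` allowance are hypothetical and not part of the dataset release.

```python
import io
import wave

# Expected properties, taken from the specification table.
EXPECTED_SAMPLE_RATE = 16_000  # Hz
EXPECTED_CHANNELS = 1          # mono
EXPECTED_DURATION_S = 10.0     # seconds per clip


def check_clip(source, tolerance_s=0.05):
    """Return a list of problems found in one WAV clip (empty if it conforms).

    `source` may be a file path or a binary file-like object.
    `tolerance_s` is an assumed allowance for small length differences.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate

    problems = []
    if rate != EXPECTED_SAMPLE_RATE:
        problems.append(f"sample rate is {rate} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"channel count is {channels}")
    if abs(duration - EXPECTED_DURATION_S) > tolerance_s:
        problems.append(f"duration is {duration:.2f} s")
    return problems


def make_silent_clip(seconds, rate=16_000, channels=1):
    """Build an in-memory 16-bit PCM WAV of silence, used only for the demo."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate) * channels)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(check_clip(make_silent_clip(10)))           # conforming clip
    print(check_clip(make_silent_clip(5, rate=8000)))  # wrong rate and length
```

If every clip conforms, 5,035 recordings of 10 seconds each amount to 50,350 seconds, or roughly 14 hours of audio.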
Dataset Classes
The dataset covers ten urban noise categories commonly observed in Bangladeshi cities:
- Bike - Motorcycles, engine noise, acceleration, horn usage
- Bus - Engine sounds, braking, acceleration, terminal noise
- Car - Passenger vehicles, engine operation, horn usage
- CNG Auto-rickshaw - Distinctive South Asian transport sounds
- Construction Noise - Building sites, machinery
- Protest - Culturally specific protest acoustics
- Siren - Emergency vehicle sounds
- Traffic Jam - Multi-layered dense traffic
- Train - Railway sounds
- Truck - Heavy vehicle noise
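For classification experiments, the ten categories need a fixed mapping to integer labels. The sketch below uses the class names exactly as listed above. The id order simply follows the list and is an assumption, as the article does not specify label ids or folder names.

```python
# Class names as listed in this section; the integer ids follow the list
# order, which is an assumption and not defined by the dataset itself.
CLASSES = [
    "Bike",
    "Bus",
    "Car",
    "CNG Auto-rickshaw",
    "Construction Noise",
    "Protest",
    "Siren",
    "Traffic Jam",
    "Train",
    "Truck",
]

# Forward and reverse lookups between class names and integer ids.
LABEL_TO_ID = {name: index for index, name in enumerate(CLASSES)}
ID_TO_LABEL = {index: name for name, index in LABEL_TO_ID.items()}

if __name__ == "__main__":
    for name, index in LABEL_TO_ID.items():
        print(index, name)
```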
Data Collection
Recording Devices
- Samsung Galaxy S22 Ultra
- Oppo A15
- Realme C35
Collection Locations
Data were collected across multiple regions of Bangladesh:
- Sylhet, Dhaka, Chattogram, Rajshahi
- Khulna, Barishal, Rangpur, Mymensingh
- Bandarban, Chandpur
Collection Period
February - November 2025 (10 months)
Machine Learning Baselines
Model Performance
| Model | Accuracy | F1-Score |
|---|---|---|
| Whisper-Base | 98% | High |
| Wav2Vec2-Base | 97% | High |
| CNN | 90% | Moderate |
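The table reports F1-Score only qualitatively. For readers who want to compute numeric values on their own predictions, the sketch below shows accuracy and macro-averaged F1 in plain Python. The article does not state which F1 averaging was used, so macro averaging is an assumption, and the function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    truth = ["Bus", "Bus", "Car", "Car"]
    guess = ["Bus", "Car", "Car", "Car"]
    print(accuracy(truth, guess), macro_f1(truth, guess))
```

Macro averaging weights every class equally, which can be informative when class sizes differ; a weighted or micro average would be an equally valid choice.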
Training Configuration
Wav2Vec2-Base:
- Raw 16 kHz mono audio waveforms
- 10 epochs, AdamW optimizer
- Learning rate:
- 80/20 stratified train-test split
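An 80/20 stratified split keeps each class's share roughly equal in the training and test sets. The following is a minimal standard-library sketch of that idea. The file names, the seed, and the function name are hypothetical, and the authors' actual splitting code is not described in the article.

```python
import random
from collections import defaultdict


def stratified_split(items, labels, test_fraction=0.2, seed=0):
    """Split (item, label) pairs so each class contributes ~test_fraction to test."""
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)

    rng = random.Random(seed)  # fixed seed (assumed) for a reproducible split
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test += [(item, label) for item in group[:n_test]]
        train += [(item, label) for item in group[n_test:]]
    return train, test


if __name__ == "__main__":
    # Hypothetical file names; the real dataset layout may differ.
    clips = [f"clip_{i}.wav" for i in range(100)]
    classes = ["Bus"] * 50 + ["Car"] * 50
    train_set, test_set = stratified_split(clips, classes)
    print(len(train_set), len(test_set))
```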
Whisper-Base:
- Log-Mel spectrogram features
- 10 epochs, AdamW optimizer
- Learning rate:
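Log-Mel features place filter bands evenly on the mel scale and then take the logarithm of each band's energy. The sketch below illustrates only the frequency-axis part of that process for 16 kHz audio, whose upper limit is the 8,000 Hz Nyquist frequency. It uses the common HTK-style mel formula and 80 bands, both of which are assumptions here; the article does not give the feature-extraction parameters, and a real pipeline would normally rely on the model's own feature extractor.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to mels (HTK-style formula, assumed)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels, f_min, f_max):
    """Center frequencies (Hz) of n_mels bands spaced evenly on the mel scale."""
    low, high = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (high - low) / (n_mels + 1)
    return [mel_to_hz(low + step * (i + 1)) for i in range(n_mels)]


def log_compress(band_energy, floor=1e-10):
    """Log10 of a band energy, floored to avoid taking log of zero."""
    return math.log10(max(band_energy, floor))


if __name__ == "__main__":
    # 80 bands is an assumed value; 8000 Hz is the Nyquist limit at 16 kHz.
    centers = mel_center_frequencies(n_mels=80, f_min=0.0, f_max=8000.0)
    print([round(c) for c in centers[:5]], round(centers[-1]))
```

The band centers are densely packed at low frequencies and spread out at high frequencies, which is the property that distinguishes a mel spectrogram from a linear one.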
Value of the Data
- First South Asian Dataset: First publicly available, large-scale environmental noise dataset for Bangladesh
- Unique Regional Signatures: CNG auto-rickshaw, dense traffic jams, culturally specific protest acoustics
- High Learnability: The best baseline (Whisper-Base) reaches 98% accuracy, indicating strong class separability
- Raw Format: Unfiltered 16-bit WAV preserves full spectral content
- Urban Computing Applications: Smart-city analytics, noise monitoring, health impact studies
Data Availability
- Repository: Zenodo
- DOI: 10.5281/zenodo.8239067
- URL: https://zenodo.org/record/8239067
CRediT Author Statement
- Md. Nasir Uddin: Data Collection, Preprocessing, Validation, Visualization
- Md. Mehedi Hasan: Conceptualization, Data Curation, Software, Visualization, Writing—Original Draft
- Mohammad Shahidur Rahman: Supervision, Project Administration, Resources
Keywords
Urban sound dataset, Audio dataset, Bangladesh, Sound classification, Machine learning, Acoustic analysis, Wav2Vec2, Whisper, Deep learning