Publications

MIDI-Vis: Effective Music Visualization for Exploring and Evaluating Generated Alternatives in Computer-Assisted Composition

MIDI-Vis, a MIDI visualization tool, is proposed with two exploratory functions: (1) MIDIComp, a view that lets the user visually compare the symbolic content of MIDI files, and (2) MIDICluster, a t-Distributed Stochastic Neighbor Embedding (t-SNE) view for exploring MIDI clusters by discriminating useful musical dimensions extracted from MIDI files. A web-based, fully responsive system is developed using p5.js and Node.js. It is tested on MIDI corpus data, including through an integration with Calliope, a computer-assisted composition (CAC) system that helps generate batches of musical variations of a given MIDI source file.

To be submitted to NIME.
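As a rough illustration of the MIDICluster idea, the sketch below derives a simple musical feature (a pitch-class histogram) for each file in a hypothetical corpus and projects the feature vectors to 2D. PCA is used here as a lightweight stand-in for the t-SNE embedding the tool actually uses; the corpus data and feature choice are assumptions, not MIDI-Vis internals.

```python
import numpy as np

def pitch_class_histogram(pitches):
    """Normalized 12-bin histogram of MIDI pitch classes (0-11)."""
    hist = np.bincount(np.asarray(pitches) % 12, minlength=12).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def embed_2d(feature_matrix):
    """Project feature vectors to 2D via PCA (stand-in for t-SNE)."""
    X = feature_matrix - feature_matrix.mean(axis=0)
    # SVD of the centered data gives the principal axes; keep the top two.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T

# Hypothetical corpus: each entry is a list of MIDI note numbers.
corpus = [
    [60, 64, 67, 72],   # C major arpeggio
    [60, 63, 67, 72],   # C minor arpeggio
    [62, 66, 69, 74],   # D major arpeggio
]
features = np.stack([pitch_class_histogram(p) for p in corpus])
points = embed_2d(features)   # one 2D point per MIDI file, ready to plot
```

Each row of `points` could then be drawn as a dot in a p5.js canvas, so that files with similar pitch content cluster together.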

Classification of ISL using Pose and Object Detection based Techniques

In this project, we aim to bridge the communication gap for the hearing-impaired, and thereby contribute to building more inclusive environments for them, by presenting two approaches to the classification of Indian Sign Language: (a) an object-detection-based approach, which uses a model built on the Scaled-YOLOv4 architecture to perform frame-by-frame inference, and (b) a pose-based approach, which uses an LSTM model that takes skeletal pose landmarks from MediaPipe over a sequence of frames as input to predict the signed action. Using MediaPipe landmarks raised the LSTM model's accuracy to around 98% for 8 classes. However, experimentation showed that this approach does not scale well, as model performance falls drastically as the number of classes grows. The object-detection route allows us to train about three times as many classes on the Scaled-YOLOv4 architecture with little loss of performance as the class count rises, achieving a final accuracy of 95.9% for 25 classes.

Accepted at SmartCom 2023.
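The pose-based pipeline described above can be sketched as a single LSTM pass over a sequence of per-frame landmark vectors, followed by a softmax over sign classes. The sketch below uses numpy with randomly initialized weights purely for illustration; the frame count (30), feature size (99, i.e. 33 pose landmarks with x, y, z), and hidden size (64) are assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
SEQ_LEN, N_FEATURES, HIDDEN, N_CLASSES = 30, 99, 64, 8

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Randomly initialized weights for one LSTM layer (illustration only;
# a real model would learn these from labeled sign sequences).
Wx = rng.normal(scale=0.1, size=(4 * HIDDEN, N_FEATURES))
Wh = rng.normal(scale=0.1, size=(4 * HIDDEN, HIDDEN))
b = np.zeros(4 * HIDDEN)
W_out = rng.normal(scale=0.1, size=(N_CLASSES, HIDDEN))

def lstm_classify(seq):
    """Run an LSTM over a landmark sequence; classify from the final state."""
    h = np.zeros(HIDDEN)
    c = np.zeros(HIDDEN)
    for x in seq:                      # one frame of pose landmarks at a time
        z = Wx @ x + Wh @ h + b
        i, f, g, o = np.split(z, 4)    # input, forget, cell, output gates
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)
        h = o * np.tanh(c)
    logits = W_out @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()                 # softmax over the sign classes

# Hypothetical input: 30 frames x 99 pose-landmark coordinates.
sequence = rng.normal(size=(SEQ_LEN, N_FEATURES))
probs = lstm_classify(sequence)        # probability per sign class
```

The key design point is that the LSTM consumes the whole frame sequence before classifying, so temporal dynamics of the sign (not just a single pose) drive the prediction.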

Deep-Learning Spatiotemporal Prediction Framework for Particulate Matter under Dynamic Monitoring

Published in Transportation Research Record: Journal of the Transportation Research Board, 2022

A spatiotemporal prediction of hourly particulate matter concentrations for Delhi, India is performed with several deep-learning modeling techniques. Secondary data on particulate matter concentrations and meteorological parameters for the four static monitors in the area are collected from the Central Pollution Control Board (CPCB) for January 2019 through April 2021. Three models, based on a convolutional neural network (CNN), long short-term memory (LSTM), and a hybrid CNN-LSTM, are developed for a total of 15 hexagonal cells. The CNN-LSTM model's predictions agree closely with the values recorded by the static monitors. Moreover, compared with existing and individual models, the proposed hybrid CNN-LSTM model performs better for most of the cells.

Mittal, V., Sasetty, S., Choudhary, R., & Agarwal, A. (2022). Deep-Learning Spatiotemporal Prediction Framework for Particulate Matter under Dynamic Monitoring. Transportation Research Record, 2676(8), 56-73. https://journals.sagepub.com/doi/10.1177/03611981221082589
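A minimal sketch of the data preparation such a spatiotemporal model needs: slicing the hourly series for the 15 hexagonal cells into sliding windows of past hours (CNN-LSTM input) and a future hour (target). The window length, horizon, and synthetic data below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

N_CELLS, N_HOURS = 15, 200     # 15 hexagonal cells, hypothetical record length
WINDOW, HORIZON = 24, 1        # use 24 past hours to predict 1 hour ahead

rng = np.random.default_rng(1)
pm = rng.uniform(20, 300, size=(N_HOURS, N_CELLS))  # synthetic hourly PM grid

def make_windows(series, window, horizon):
    """Slice (hours, cells) into (samples, window, cells) inputs and targets."""
    X, y = [], []
    for t in range(len(series) - window - horizon + 1):
        X.append(series[t : t + window])             # past window, all cells
        y.append(series[t + window + horizon - 1])   # future hour, all cells
    return np.stack(X), np.stack(y)

X, y = make_windows(pm, WINDOW, HORIZON)
# X: samples x timesteps x cells, the 3D tensor a CNN-LSTM would consume;
# the CNN can learn spatial patterns across cells within each timestep,
# and the LSTM the temporal dynamics across the window.
```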