Detection of Morphed Face, Body, Audio signals using Deep Neural Networks

Abstract

Deepfakes are a prominent topic in deep learning. The technique is typically used to alter a person's face or body to create a fake image or video. With the rise of the internet, the amount of fake content, especially deepfakes, has grown rapidly. There have already been cases of deepfakes causing conflict and hatred among people. To keep this misinformation in check, there needs to be a way to distinguish deepfakes from genuine content. We therefore propose a model that classifies media as deepfake or pristine, accurately and quickly, so that anyone can upload an image or video to check whether it is genuine. The signals considered for classification are face, audio, and body language. The face model uses an MMOD-CNN face detector to pre-process the input, which is then passed to a Temporal Convolutional Network (TCN) for prediction. For audio deepfake detection, the audio is converted into a spectrogram and passed to a ResNet50V2 followed by a TCN. The body language model uses a vanilla TCN to predict whether a video is a deepfake.
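
All three branches described in the abstract end in a TCN, so a small illustration of its core operation may help. The sketch below is not the paper's implementation: it is a dependency-free toy showing a causal dilated 1-D convolution, the building block TCNs are generally made of. The function names, the kernel size of 2, and the dilations 1, 2, 4 are assumptions chosen for illustration; the abstract does not state the actual configuration.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], kernel: Sequence[float], dilation: int = 1
) -> List[float]:
    """Causal 1-D convolution over a single-channel sequence.

    output[t] depends only on x[t], x[t - dilation], x[t - 2*dilation], ...
    so no information from future time steps leaks into the output.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start act as zero padding
                acc += w * x[idx]
        out.append(acc)
    return out


def tcn_stack(
    x: Sequence[float],
    kernels: Sequence[Sequence[float]],
    dilations: Sequence[int],
) -> List[float]:
    """Apply one causal dilated convolution + ReLU per (kernel, dilation) pair.

    Hypothetical simplification: a real TCN also uses multiple channels,
    residual connections, and learned weights.
    """
    h = list(x)
    for kernel, d in zip(kernels, dilations):
        h = [max(0.0, v) for v in causal_dilated_conv1d(h, kernel, d)]
    return h


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Time steps visible to one output, with one convolution per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 9
    response = tcn_stack(impulse, [[1.0, 1.0]] * 3, [1, 2, 4])
    print(response)
    print(receptive_field(2, [1, 2, 4]))
```

Doubling the dilation at each layer lets the receptive field grow exponentially with depth while the number of weights grows only linearly, which is the usual argument for using a TCN on long frame or spectrogram sequences.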

Publication
In the 7th International Conference for Convergence in Technology, 2022
Dheeraj Gharde
MS CS - USC

My interests include fullstack development and software infrastructure engineering.