Inconsistency Detector between Low-Quality Video and Audio Using Deepfake

  • Muhammad Taha
  • Shakil Ahmed Taha
  • Sana Ahmed Alam
  • Wazir Ahmed Ali
  • Asghar Ahmed Khan
  • Abdul Ahmed Kahliq
Keywords: Recurrent Neural Networks, Artificial Intelligence, Deepfake

Abstract

This study addresses the detection of inconsistencies between low-resolution video and its accompanying audio using deep learning, motivated by the growing prevalence of manipulated media. We first review existing detection approaches and their limitations on low-quality video, then use a dual-stream convolutional architecture to analyze the relationship between auditory and visual cues. We train and test the model on the VidTIMIT dataset and compare its performance with that of existing pre-trained models. The evaluation uses accuracy, confusion matrices, and the F1 score to assess detection performance. During preprocessing, we apply filters such as edge-preserving smoothing and Gaussian blur to the video frames so that audio-visual disparities are easier to detect. In this architecture, the visual stream applies convolutional layers to capture spatial characteristics from low-quality video frames, and the audio stream applies recurrent neural networks (RNNs) to capture temporal patterns in the audio signal. A fusion module combines the two streams, enabling synchronization analysis and anomaly detection. The trained network performs well on active video detection and lip-reading tasks.
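
The evaluation metrics named in the abstract (accuracy, confusion matrix, F1 score) can be illustrated with a minimal sketch. It assumes a binary labelling in which 1 marks a clip whose audio and video are inconsistent and 0 marks a consistent clip. The function names and the sample labels are hypothetical and are not taken from the study's code or data.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary labelling."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def accuracy(tp, fp, fn, tn):
    """Fraction of clips that were labelled correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical ground truth and predictions for eight clips
# (1 = audio and video inconsistent, 0 = consistent).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("confusion matrix [[tn, fp], [fn, tp]]:", [[tn, fp], [fn, tp]])
print("accuracy:", accuracy(tp, fp, fn, tn))
print("F1:", f1_score(tp, fp, fn))
```

Reporting the F1 score alongside accuracy matters when tampered and genuine clips are unevenly represented, since accuracy alone can look high for a detector that rarely flags the minority class.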

Published
2025-08-09
How to Cite
Taha, M., Taha, S., Alam, S., Ali, W., Khan, A., & Kahliq, A. (2025). Inconsistency Detector between Low-Quality Video and Audio Using Deepfake. International Journal of Artificial Intelligence & Mathematical Sciences, 3(2), 1-13. https://doi.org/10.58921/ijaims.v3i2.122