DeepPoseNet: A Comprehensive Study on Human Pose Estimation with Deep Learning Technique
Abstract
This article presents a thorough analysis of deep learning techniques for estimating human poses from images. It examines essential architectures such as convolutional neural networks (CNNs) and explains why traditional methods fall short, and it discusses the roles of attention mechanisms and transfer learning. The proposed approach uses a two-stage CNN model: the first network identifies body parts, and the second focuses on the parts identified. A VGG16-based network is used to localize body parts accurately. The models are compared on benchmark datasets using performance measures, with particular attention to the MPII dataset, which is used for both model training and validation. Deep pose estimation has significant social and economic implications, with applications in human-computer interaction, sports analysis, healthcare, and other areas. The conclusion summarizes the key insights, highlighting the strengths identified as well as the gaps that require further research, and calls for interdisciplinary cooperation to advance the field.
References
Agarwal, A., & Triggs, B. (2004). Tracking articulated motion using a mixture of autoregressive models. In Computer Vision-ECCV 2004: 8th European Conference on Computer Vision, Prague, Czech Republic, May 11-14, 2004. Proceedings, Part III 8 (pp. 54-65). Springer Berlin Heidelberg.
Toshev, A., & Szegedy, C. (2014). DeepPose: Human pose estimation via deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1653-1660).
Newell, A., Yang, K., & Deng, J. (2016). Stacked hourglass networks for human pose estimation. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part VIII 14 (pp. 483-499). Springer International Publishing.
Insafutdinov, E., Pishchulin, L., Andres, B., Andriluka, M., & Schiele, B. (2016). DeeperCut: A deeper, stronger, and faster multi-person pose estimation model. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part VI 14 (pp. 34-50). Springer International Publishing.
Lin, J., Wei, Z., Li, Z., Xu, S., Jia, K., & Li, Y. (2021). DualPoseNet: Category-level 6D object pose and size estimation using dual pose network with refined learning of pose consistency. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3560-3569).
Pavlakos, G., Zhou, X., Derpanis, K. G., & Daniilidis, K. (2017). Coarse-to-fine volumetric prediction for single-image 3D human pose. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7025-7034).
Singh, A., Agarwal, S., Nagrath, P., Saxena, A., & Thakur, N. (2019, February). Human pose estimation using convolutional neural networks. In 2019 Amity international conference on artificial intelligence (AICAI) (pp. 946-952). IEEE.
Belagiannis, V., & Zisserman, A. (2017, May). Recurrent human pose estimation. In 2017 12th IEEE international conference on automatic face & gesture recognition (FG 2017) (pp. 468-475). IEEE.
Wang, K., Lin, L., Jiang, C., Qian, C., & Wei, P. (2019). 3D human pose machines with self-supervised learning. IEEE transactions on pattern analysis and machine intelligence, 42(5), 1069-1082.
Rogez, G., Weinzaepfel, P., & Schmid, C. (2019). LCR-Net++: Multi-person 2D and 3D pose detection in natural images. IEEE transactions on pattern analysis and machine intelligence, 42(5), 1146-1161.