Deep Learning Approaches Based on Transformer Architectures for Image Captioning Tasks