Publication

Recognizing Food Places in Egocentric Photo-Streams Using Multi-Scale Atrous Convolutional Networks and Self-Attention Mechanism

Sarker, M. M. K., Rashwan, H. A., Akram, F., Talavera, E., Banu, S. F., Radeva, P. & Puig, D., 2019, In: IEEE Access. 7, p. 39069-39082 14 p.

Research output: Contribution to journal › Article › Academic › peer-review

APA

Sarker, M. M. K., Rashwan, H. A., Akram, F., Talavera, E., Banu, S. F., Radeva, P., & Puig, D. (2019). Recognizing Food Places in Egocentric Photo-Streams Using Multi-Scale Atrous Convolutional Networks and Self-Attention Mechanism. IEEE Access, 7, 39069-39082. https://doi.org/10.1109/ACCESS.2019.2902225

Author

Sarker, Md Mostafa Kamal ; Rashwan, Hatem A. ; Akram, Farhan ; Talavera, Estefania ; Banu, Syeda Furruka ; Radeva, Petia ; Puig, Domenec. / Recognizing Food Places in Egocentric Photo-Streams Using Multi-Scale Atrous Convolutional Networks and Self-Attention Mechanism. In: IEEE Access. 2019 ; Vol. 7. pp. 39069-39082.

Harvard

Sarker, MMK, Rashwan, HA, Akram, F, Talavera, E, Banu, SF, Radeva, P & Puig, D 2019, 'Recognizing Food Places in Egocentric Photo-Streams Using Multi-Scale Atrous Convolutional Networks and Self-Attention Mechanism', IEEE Access, vol. 7, pp. 39069-39082. https://doi.org/10.1109/ACCESS.2019.2902225

Standard

Recognizing Food Places in Egocentric Photo-Streams Using Multi-Scale Atrous Convolutional Networks and Self-Attention Mechanism. / Sarker, Md Mostafa Kamal; Rashwan, Hatem A.; Akram, Farhan; Talavera, Estefania; Banu, Syeda Furruka; Radeva, Petia; Puig, Domenec.

In: IEEE Access, Vol. 7, 2019, p. 39069-39082.

Research output: Contribution to journal › Article › Academic › peer-review

Vancouver

Sarker MMK, Rashwan HA, Akram F, Talavera E, Banu SF, Radeva P et al. Recognizing Food Places in Egocentric Photo-Streams Using Multi-Scale Atrous Convolutional Networks and Self-Attention Mechanism. IEEE Access. 2019;7:39069-39082. https://doi.org/10.1109/ACCESS.2019.2902225


BibTeX

@article{2c8909bbb1f44761af5bcacbb1d61d53,
title = "Recognizing Food Places in Egocentric Photo-Streams Using Multi-Scale Atrous Convolutional Networks and Self-Attention Mechanism",
abstract = "Wearable sensors (e.g., lifelogging cameras) are very useful tools for monitoring people's daily habits and lifestyle. Wearable cameras can continuously capture different moments of their wearers' day, their environment, and their interactions with objects, people, and places, reflecting their personal lifestyle. The food places where people eat, drink, and buy food, such as restaurants, bars, and supermarkets, can directly affect their daily dietary intake and behavior. Consequently, an automated monitoring system that analyzes a person's food habits from daily recorded egocentric photo-streams of food places can provide a valuable means for people to improve their eating habits. This can be done by classifying the captured food place images into different groups and generating a detailed report of the time spent in specific food places. In this paper, we propose a self-attention mechanism with multi-scale atrous convolutional networks to generate discriminative features from image streams in order to recognize a predetermined set of food place categories. We apply our model to an egocentric food place dataset called {"}EgoFoodPlaces{"}, which comprises 43 392 images captured by 16 individuals using a lifelogging camera. The proposed model achieved an overall classification accuracy of 80% on the {"}EgoFoodPlaces{"} dataset, outperforming baseline methods such as VGG16, ResNet50, and InceptionV3.",
keywords = "Food places recognition, scene classification, self-attention model, atrous convolutional networks, egocentric photo-streams, visual lifelogging, SCENE, CLASSIFICATION, OBESITY",
author = "Sarker, {Md Mostafa Kamal} and Rashwan, {Hatem A.} and Farhan Akram and Estefania Talavera and Banu, {Syeda Furruka} and Petia Radeva and Domenec Puig",
year = "2019",
doi = "10.1109/ACCESS.2019.2902225",
language = "English",
volume = "7",
pages = "39069--39082",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

RIS

TY - JOUR

T1 - Recognizing Food Places in Egocentric Photo-Streams Using Multi-Scale Atrous Convolutional Networks and Self-Attention Mechanism

AU - Sarker, Md Mostafa Kamal

AU - Rashwan, Hatem A.

AU - Akram, Farhan

AU - Talavera, Estefania

AU - Banu, Syeda Furruka

AU - Radeva, Petia

AU - Puig, Domenec

PY - 2019

Y1 - 2019

N2 - Wearable sensors (e.g., lifelogging cameras) are very useful tools for monitoring people's daily habits and lifestyle. Wearable cameras can continuously capture different moments of their wearers' day, their environment, and their interactions with objects, people, and places, reflecting their personal lifestyle. The food places where people eat, drink, and buy food, such as restaurants, bars, and supermarkets, can directly affect their daily dietary intake and behavior. Consequently, an automated monitoring system that analyzes a person's food habits from daily recorded egocentric photo-streams of food places can provide a valuable means for people to improve their eating habits. This can be done by classifying the captured food place images into different groups and generating a detailed report of the time spent in specific food places. In this paper, we propose a self-attention mechanism with multi-scale atrous convolutional networks to generate discriminative features from image streams in order to recognize a predetermined set of food place categories. We apply our model to an egocentric food place dataset called "EgoFoodPlaces", which comprises 43 392 images captured by 16 individuals using a lifelogging camera. The proposed model achieved an overall classification accuracy of 80% on the "EgoFoodPlaces" dataset, outperforming baseline methods such as VGG16, ResNet50, and InceptionV3.

AB - Wearable sensors (e.g., lifelogging cameras) are very useful tools for monitoring people's daily habits and lifestyle. Wearable cameras can continuously capture different moments of their wearers' day, their environment, and their interactions with objects, people, and places, reflecting their personal lifestyle. The food places where people eat, drink, and buy food, such as restaurants, bars, and supermarkets, can directly affect their daily dietary intake and behavior. Consequently, an automated monitoring system that analyzes a person's food habits from daily recorded egocentric photo-streams of food places can provide a valuable means for people to improve their eating habits. This can be done by classifying the captured food place images into different groups and generating a detailed report of the time spent in specific food places. In this paper, we propose a self-attention mechanism with multi-scale atrous convolutional networks to generate discriminative features from image streams in order to recognize a predetermined set of food place categories. We apply our model to an egocentric food place dataset called "EgoFoodPlaces", which comprises 43 392 images captured by 16 individuals using a lifelogging camera. The proposed model achieved an overall classification accuracy of 80% on the "EgoFoodPlaces" dataset, outperforming baseline methods such as VGG16, ResNet50, and InceptionV3.

KW - Food places recognition

KW - scene classification

KW - self-attention model

KW - atrous convolutional networks

KW - egocentric photo-streams

KW - visual lifelogging

KW - SCENE

KW - CLASSIFICATION

KW - OBESITY

U2 - 10.1109/ACCESS.2019.2902225

DO - 10.1109/ACCESS.2019.2902225

M3 - Article

VL - 7

SP - 39069

EP - 39082

JO - IEEE Access

JF - IEEE Access

SN - 2169-3536

ER -

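The abstract describes two building blocks: multi-scale atrous (dilated) convolution and a self-attention mechanism. A minimal NumPy sketch of the two ideas follows, for illustration only; the single-channel setting, function names, and shapes are assumptions, not the authors' implementation.

```python
import numpy as np

def atrous_conv2d(x, kernel, rate):
    """Valid-mode 2D convolution with dilation `rate` (atrous convolution)."""
    kh, kw = kernel.shape
    # The effective receptive field grows with the dilation rate.
    eh, ew = (kh - 1) * rate + 1, (kw - 1) * rate + 1
    H, W = x.shape
    out = np.zeros((H - eh + 1, W - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sample the input "with holes": every `rate`-th pixel of the window.
            patch = x[i:i + eh:rate, j:j + ew:rate]
            out[i, j] = np.sum(patch * kernel)
    return out

def self_attention(feats):
    """Scaled dot-product self-attention over a set of feature vectors (N, d)."""
    d = feats.shape[1]
    scores = feats @ feats.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # rows sum to 1
    return weights @ feats                        # re-weighted features

x = np.arange(36, dtype=float).reshape(6, 6)
k = np.ones((3, 3))
# "Multi-scale": the same kernel applied at several dilation rates,
# giving feature maps with different receptive fields.
multi_scale = [atrous_conv2d(x, k, rate) for rate in (1, 2)]
attended = self_attention(np.random.default_rng(0).normal(size=(4, 8)))
```

In the paper's setting such multi-rate feature maps would be produced per channel by a CNN backbone and the attention applied over spatial or stream features; the sketch only shows the arithmetic each block performs.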