Matt Coler: Contributing to changing how humans and machines communicate
Date: 14 January 2022
Director and associate professor of MSc Voice Technology Matt Coler talks about his interdisciplinary background and fascination with speech and voice technology. Learn more about the field and Matt’s personality, and find out if you may be the right fit for the Master!
How did your career path unfold?
“My academic career began in a typically interdisciplinary way: I started out my undergraduate studies as a physics major before switching tracks in my final year and graduating with a bachelor’s in philosophy with a minor in Mandarin Chinese. After a hiatus working in Taiwan, I started my graduate studies in Linguistics at NYU and finished in Amsterdam, graduating cum laude. I defended my PhD in linguistics at the VU Amsterdam in 2010. My work involved recording and analyzing the audio of a previously undescribed language in the remote Peruvian highlands. After graduating, I accepted a postdoctoral research position at VU Amsterdam. Next, I joined a tech start-up as Department Head of a team of engineers and social scientists modeling acoustic perception for cognitive sensors. Working in industry infused my research with fresh energy and a focus on social impact. Collaborating with partners from industry and academia shaped my focus: interdisciplinary research in speech and voice technology. I transitioned back to academia when I joined the then-fledgling faculty Campus Fryslân. This new environment was the catalyst with which I, together with the Development Team and partners from the private sector, launched the new Master’s in Voice Technology.”
What makes speech so interesting?
“For me, speech is a fascinating object of study because it is so interdisciplinary. On one hand, speech involves cognitive phenomena, including recognition, understanding, meaning, and, in short, a situated theory of mind. On the other hand, it is biomechanical, involving the physiological mechanisms of auditory and visual perception. It is also physical and can be represented as a waveform or spectrogram. On top of that, speech can be studied from the perspective of an array of different linguistic sub-disciplines. Finally, speech and voice are also deeply personal and individual, even intimate. They express the subtleties of our mood and the texture of our personality, and they encompass a sense of self. This complexity is precisely what makes this topic so compelling -- and augmenting these scientific studies with engineering and technically informed approaches is the cornerstone of the MSc Voice Technology.”
What can be achieved using voice technology?
“While the scientific and intellectual motivations for the MSc Voice Technology are exciting, what drives me most is the social impact. Voice tech is so much more than Alexa and Siri. Those devices are the tip of the iceberg -- essentially sophisticated techniques for selling products. For me, voice technology holds promise for society. It has remarkable potential for projects relating to health, for example, recognizing oral cancer or certain neurodegenerative diseases. It can help make the way humans and machines interact more natural. Further, it can be an important resource to support minoritized communities, providing tools for speakers of under-resourced languages. I believe that achieving this vision is a joint effort. That’s why I partner with colleagues from other universities and the private sector to bring in state-of-the-art techniques and pool resources. For me this is not only in the best interest of the students in the programme, but also of the field itself; as with any scientific endeavor, collaboration is key.”
Share some insights into the ongoing PhD research projects.
“The MSc Voice Technology team is composed not just of lecturers and professors, but also PhD students. For example, my colleague Phat Do is working on voice synthesis for Frisian. If you want to make a synthetic voice for a well-resourced language like English or French, there is so much data available. This is not the case for Frisian, so Phat has carried out an experiment to augment the Frisian data set with data from other languages, including closely related languages like English and Dutch but also totally unrelated ones like Chinese. Results are very encouraging! The PhD project will culminate in a natural synthetic voice for Frisian, available for free to the general public. It could be used by people with visual disabilities or a voicing issue. Another new project, on voice recognition, is being carried out by Xiyuan Gao, who is working on sarcasm detection. The goal of her research is to help us interact with our devices more naturally. People often speak in a non-literal way, mixing in sarcasm and irony, for example. But our devices fail to recognize it -- until now! A new PhD student, Frank Hopwood, will start in February. He will work on a voice tech project with under-resourced Dutch languages like Frisian or Gronings.”
Who is the MSc Voice Technology for?
“There is a range of student profiles that can fit in the programme. I would underscore that what matters more than academic profile is genuine interest in and commitment to the topic. At the same time, it is important to be willing to acquire skills like programming and even mathematics, which are part of the programme, coupled with a sensitivity to language, melody, and even music, and an appreciation for the arts. While most students have abilities and interests either on the technical / computational side or the social / linguistic side, as long as the individual is willing to explore uncharted territory and go out of their comfort zone, they can succeed in the programme. The programme is flexible enough to help students find what works for them. The ambition is for everybody to acquire expertise in the field in accordance with their interests and abilities and to find a job in the field once they graduate. That said, I would emphasize that this isn’t a cakewalk. It’s an intensive one-year programme and we cover a lot of ground!”
What is your typical day like?
“There is no such thing as a typical day in academia. It is a dynamic position. I balance teaching and research, and I am also a director of the graduate school. At the moment most of my time is dedicated to the Voice Technology programme. There are all kinds of unexpected situations that arise. It’s a labour of love. I want to help our first cohort of students get to the finish line while continuously improving the programme. And I want to help the PhD researchers produce cutting-edge research and further the horizons of science.”
And how do you spend your free time?
“I spend my free time with my family, including my two toddlers. We have a very multilingual household and watching the kiddos learn to negotiate use of different languages, gestures, melodies, and interaction styles is a marvel that continuously reminds me how much farther we have to go in the voice tech field.”