May 18, 2024

Deep Neural Networks: Promising Models For Human Hearing

Researchers from MIT have conducted a groundbreaking study demonstrating the potential of deep neural networks as models of human hearing. These computational models, derived from machine learning, show promise in designing improved hearing aids, cochlear implants, and brain-machine interfaces. The study, which represents the largest investigation to date of deep neural networks trained to perform auditory tasks, reveals that these models generate internal representations that resemble those observed in the human brain during sound perception.

One key finding of the study is that training deep neural network models on auditory input that includes background noise produces activation patterns in the models that more closely mimic those of the human auditory cortex. This insight offers valuable guidance on optimizing the training process for this type of model.
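
The article does not describe the training recipe in detail, but the core idea can be sketched as mixing each clean training clip with background noise at a controlled signal-to-noise ratio before the model hears it. Below is a minimal NumPy illustration; the function name, SNR range, and synthetic waveforms are assumptions for the example, not details from the study.

```python
import numpy as np

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix a clean waveform with background noise at a target SNR (in dB).

    Both inputs are 1-D float arrays at the same sample rate; the noise is
    tiled or truncated to match the clip length.
    """
    if len(noise) < len(clean):
        noise = np.tile(noise, int(np.ceil(len(clean) / len(noise))))
    noise = noise[: len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale the noise so that 10 * log10(clean_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

# Example: augment a clip with noise at a randomly drawn SNR.
rng = np.random.default_rng(0)
clean_clip = rng.standard_normal(16000)    # stand-in for 1 s of speech at 16 kHz
background = rng.standard_normal(16000)    # stand-in for recorded background noise
noisy_clip = mix_at_snr(clean_clip, background, snr_db=rng.uniform(-6, 12))
```

Randomizing the SNR from clip to clip forces the model to recognize sounds under the kind of variable listening conditions humans routinely face, which is the intuition behind the finding.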

This study is distinctive because it represents the most comprehensive comparison of deep neural network models to the auditory system to date. It indicates that models derived from machine learning are moving in the right direction and provides clues as to what makes them better approximations of the brain. According to Josh McDermott, the senior author of the study and an associate professor of brain and cognitive sciences at MIT, these models have the potential to revolutionize our understanding of the brain and open up new possibilities for technological advancements in the field.

The lead authors of the paper, published in PLOS Biology, are Greta Tuckute, an MIT graduate student, and Jenelle Feather, a Ph.D. candidate at MIT. Deep neural networks are computational models built from many layers of interconnected processing units. They can be trained on vast amounts of data to perform specific tasks and have become widely used across many applications. Neuroscientists are now exploring whether these models can also reveal how the human brain carries out the same kinds of tasks.
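
As a rough illustration of what "multiple layers of units" means in practice, the toy PyTorch model below passes a spectrogram-like input through a short stack of layers to produce task scores. The layer sizes and the ten-class task are placeholders; the models examined in the study are far larger.

```python
import torch
import torch.nn as nn

class TinyAudioNet(nn.Module):
    """A toy stack of layers mapping a (batch, 1, freq, time) spectrogram to class scores."""

    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x)        # successive layers transform the input
        h = h.mean(dim=(2, 3))      # pool over frequency and time
        return self.classifier(h)   # map pooled features to task scores

scores = TinyAudioNet()(torch.randn(2, 1, 128, 200))  # two dummy spectrograms
```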

Tuckute underscores the significance of this research by explaining that machine learning-based models can mediate behaviors on a scale that was not possible with previous types of models. Consequently, there is growing interest in whether the representations in these models capture what happens in the brain. When a neural network processes an audio input, such as a spoken word or another natural sound, it generates activation patterns across its units. These internal representations can then be compared with fMRI responses recorded while people listen to the same input.
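
One common way to make that comparison, sketched below on synthetic data, is to fit a regularized linear mapping from a layer's activations to each voxel's measured responses and score how well the mapping predicts responses to held-out sounds. The array shapes, ridge penalty, and data here are illustrative assumptions, not the study's actual analysis pipeline.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_sounds, n_units, n_voxels = 165, 512, 300

layer_activations = rng.standard_normal((n_sounds, n_units))   # model responses to each sound
voxel_responses = rng.standard_normal((n_sounds, n_voxels))    # fMRI responses to the same sounds

X_train, X_test, y_train, y_test = train_test_split(
    layer_activations, voxel_responses, test_size=0.2, random_state=0
)

# Linear mapping from model units to voxels, fit on training sounds.
mapping = Ridge(alpha=10.0).fit(X_train, y_train)
pred = mapping.predict(X_test)

# Score each voxel by the correlation between predicted and measured responses.
voxel_scores = np.array([
    np.corrcoef(pred[:, v], y_test[:, v])[0, 1] for v in range(n_voxels)
])
print(f"median predictivity across voxels: {np.median(voxel_scores):.3f}")
```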

The researchers first reported on the similarity between internal representations generated by a trained neural network and brain scans in 2018. Since then, these models have gained widespread use, prompting McDermott’s research group to conduct an extensive evaluation to determine if the ability to approximate neural representations observed in the human brain is a general characteristic of these models.

For the study, the researchers analyzed nine publicly available deep neural network models trained to perform auditory tasks and created 14 additional models based on two different architectures. Most of these models were trained to perform a single task, such as word recognition, speaker identification, environmental sound recognition, or musical genre identification. Two models were trained to perform multiple tasks.
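
The article does not specify those two architectures, but a standard way to train one network on several auditory tasks is a shared trunk with one output head per task, as in the illustrative PyTorch skeleton below. The tasks, layer sizes, and class counts are placeholders rather than details from the paper.

```python
import torch
import torch.nn as nn

class MultiTaskAudioNet(nn.Module):
    """A shared feature trunk with separate heads for word, speaker, and genre tasks."""

    def __init__(self, n_words: int, n_speakers: int, n_genres: int):
        super().__init__()
        self.trunk = nn.Sequential(   # shared layers seen by every task
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleDict({
            "word": nn.Linear(32, n_words),
            "speaker": nn.Linear(32, n_speakers),
            "genre": nn.Linear(32, n_genres),
        })

    def forward(self, spectrogram: torch.Tensor) -> dict:
        shared = self.trunk(spectrogram)
        return {task: head(shared) for task, head in self.heads.items()}

model = MultiTaskAudioNet(n_words=800, n_speakers=100, n_genres=40)
outputs = model(torch.randn(4, 1, 128, 200))   # per-task logits for a dummy batch
# Training would sum one loss per task (e.g. cross-entropy on each head's labels),
# so the shared trunk learns features useful for all of the tasks at once.
```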

The results showed that when presented with natural sounds used in human fMRI experiments, the internal representations of the deep neural network models displayed similarity to those generated by the human brain. The models that most closely resembled brain activity were those trained on multiple tasks, including auditory input with background noise.

The study also supports the concept of hierarchical organization in the human auditory cortex, where processing is divided into stages that support distinct computational functions. Similar to the 2018 study, the researchers found that representations generated in earlier stages of the model closely resembled those observed in the primary auditory cortex, while representations from later stages more closely resembled those observed in brain regions beyond the primary cortex.
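
A simple way to probe such a hierarchy is to repeat the voxel-prediction analysis separately for each model stage and each cortical region, then ask which stage best predicts each region. The self-contained sketch below does this on synthetic data; the stage names, region labels, and sizes are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
n_sounds = 165

# Synthetic activations for an "early" and a "late" model stage, and synthetic
# responses for primary and non-primary auditory cortex voxels.
stages = {"early_layer": rng.standard_normal((n_sounds, 256)),
          "late_layer": rng.standard_normal((n_sounds, 256))}
regions = {"primary_AC": rng.standard_normal((n_sounds, 100)),
           "non_primary_AC": rng.standard_normal((n_sounds, 100))}

def predictivity(X: np.ndarray, Y: np.ndarray) -> float:
    """Median voxel-wise correlation between cross-validated predictions and data."""
    pred = cross_val_predict(Ridge(alpha=10.0), X, Y, cv=5)
    return float(np.median([np.corrcoef(pred[:, v], Y[:, v])[0, 1]
                            for v in range(Y.shape[1])]))

for region, Y in regions.items():
    best = max(stages, key=lambda s: predictivity(stages[s], Y))
    print(f"{region}: best-predicting stage = {best}")
```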

Furthermore, the study revealed that models trained on different tasks captured different aspects of auditory perception. For example, representations from models trained on speech-related tasks most closely resembled those observed in brain areas specialized for speech processing.

The researchers plan to build upon these findings to develop models that can more accurately reproduce human brain responses. Such models, besides enhancing our understanding of brain organization, could be instrumental in the development of advanced hearing aids, cochlear implants, and brain-machine interfaces.

McDermott emphasizes that the ultimate goal of this field is to create a computer model capable of accurately predicting brain responses and behavior. Achieving this goal would have far-reaching implications for numerous fields of study and open up a wealth of possibilities for advancements in neuroscience and technology.
