Learning to fight infection

Researchers at the Institute of Industrial Science at The University of Tokyo introduce a machine learning method that helps predict past infection from the receptor sequences of immune T-cells even when little data is available, which may help improve human health and our understanding of adaptive immunity.
Credit: Institute of Industrial Science, The University of Tokyo

Scientific advancements have often been held back by the need for high volumes of data, which can be costly, time-consuming, and sometimes difficult to collect. But there may be a solution to this problem when investigating how our bodies fight illness: a new machine learning method called “MotifBoost.” This approach can help interpret data from T-cell receptors (TCRs) to identify past infections by specific pathogens. By focusing on a collection of short sequences of amino acids in the TCRs, a research team achieved more accurate results with smaller datasets. This work may shed light on the way the human immune system recognizes germs, which may lead to improved health outcomes.

The recent pandemic has highlighted the vital importance of the human body’s ability to fight back against novel threats. The adaptive immune system uses specialized cells, including T-cells, which prepare an array of diverse receptors that can recognize antigens specific to invading germs even before they arrive for the first time. Therefore, the diversity of the receptors is an important topic of investigation. However, the correspondence between receptors and the antigens they recognize is often difficult to determine experimentally, and current computational methods often fail if not provided with enough data.

Now, scientists from the Institute of Industrial Science at The University of Tokyo have developed a new machine learning method that can predict whether a donor has been infected, using only limited TCR data. “MotifBoost” focuses on very short segments, called k-mers, in each receptor. Although the protein motifs considered by scientists are usually much longer, the team found that extracting the frequency of each combination of three consecutive amino acids was highly effective. “Our machine learning methods trained on small-scale datasets can supplement conventional classification methods which only work on very large datasets,” first author Yotaro Katayama says. MotifBoost was inspired by the fact that different people usually produce similar TCRs when exposed to the same pathogen.
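As a rough illustration of the k-mer idea described above, the following minimal Python sketch counts every run of three consecutive amino acids across a set of receptor sequences and converts the counts to relative frequencies. The function name `kmer_frequencies` and the two example sequences are invented for illustration; this is not the authors' code, and the actual MotifBoost preprocessing may differ.

```python
from collections import Counter


def kmer_frequencies(sequences, k=3):
    """Relative frequency of each k-mer over a list of amino-acid strings.

    Hypothetical helper for illustration only; not taken from MotifBoost.
    """
    counts = Counter()
    for seq in sequences:
        # Slide a window of width k along the sequence.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Two made-up receptor fragments standing in for one donor's repertoire.
repertoire = ["CASSLG", "CASSPG"]
freqs = kmer_frequencies(repertoire)
print(freqs)
```

Each donor is thereby reduced to a single frequency table, regardless of how many receptor sequences were collected, which is what makes donors directly comparable.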

First, the researchers employed an unsupervised learning approach, in which donors were automatically sorted based on patterns found in the data. Using the k-mer distribution, donors formed distinct clusters according to whether or not they had previously been infected by cytomegalovirus (CMV). Because unsupervised learning algorithms have no information about which donors had been infected with CMV, this result indicated that the k-mer information is effective in capturing characteristics of a donor’s immune status. Then, the scientists used the k-mer distribution data for a supervised learning task, in which the algorithm was given the TCR data of each donor, along with labels indicating which donors had been infected with a specific pathogen. The algorithm was then trained to predict the label for unseen samples, and the prediction performance was tested for CMV and HIV.
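The supervised step can be sketched in a few lines of Python. This toy example is only a sketch under stated assumptions: it uses a simple nearest-centroid rule as a stand-in classifier (the article does not describe which classifier MotifBoost uses), and all sequences, labels, and function names are invented for illustration.

```python
from collections import Counter
import math


def kmer_profile(sequences, k=3):
    """3-mer relative frequencies for one donor (hypothetical helper)."""
    counts = Counter(
        seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)
    )
    total = sum(counts.values())
    return {m: n / total for m, n in counts.items()} if total else {}


def centroid(profiles):
    """Average several frequency tables, treating missing k-mers as 0."""
    keys = set().union(*profiles)
    return {m: sum(p.get(m, 0.0) for p in profiles) / len(profiles) for m in keys}


def distance(a, b):
    """Euclidean distance between two sparse frequency tables."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(m, 0.0) - b.get(m, 0.0)) ** 2 for m in keys))


def fit(donors, labels):
    """Stand-in training step: one average profile per label."""
    by_label = {}
    for donor, label in zip(donors, labels):
        by_label.setdefault(label, []).append(kmer_profile(donor))
    return {label: centroid(ps) for label, ps in by_label.items()}


def predict(model, donor):
    """Assign the label whose centroid is closest to the donor's profile."""
    profile = kmer_profile(donor)
    return min(model, key=lambda label: distance(model[label], profile))


# Invented training data: each donor is a list of made-up receptor fragments.
train_donors = [
    ["CASSLGQ", "CASSLGT"],
    ["CASSLGA"],
    ["CAWTPRE", "CAWTPRD"],
    ["CAWTPRK"],
]
train_labels = ["positive", "positive", "negative", "negative"]

model = fit(train_donors, train_labels)
print(predict(model, ["CASSLGY"]))  # an unseen donor
```

The point of the sketch is the workflow rather than the classifier: labeled donors are turned into k-mer profiles, a model is fitted, and the model is then asked to label a donor it has not seen.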

“We found that existing machine learning methods can suffer from learning instability and reduced accuracy when the number of samples drops below a certain critical size. In contrast, MotifBoost performed just as well on the large dataset, and still provided a good result on the small dataset,” says senior author Tetsuya J. Kobayashi. This research may lead to new tests for viral exposure and immune status based on T-cell composition.

This research is published in Frontiers in Immunology as “Comparative study of repertoire classification methods reveals data efficiency of k-mer feature extraction” (DOI: 10.3389/fimmu.2022.797640).

About Institute of Industrial Science, The University of Tokyo

The Institute of Industrial Science, The University of Tokyo (UTokyo-IIS) is one of the largest university-attached research institutes in Japan. Over 120 research laboratories, each headed by a faculty member, comprise UTokyo-IIS, which has more than 1,200 members (approximately 400 staff and 800 students) actively engaged in education and research. Its activities cover almost all areas of engineering. Since its foundation in 1949, UTokyo-IIS has worked to bridge the huge gaps that exist between academic disciplines and real-world applications.

Journal: Frontiers in Immunology
DOI: 10.3389/fimmu.2022.797640
Article Title: Comparative study of repertoire classification methods reveals data efficiency of k-mer feature extraction
Article Publication Date: 20-Jul-2022

Media Contact

Tetsuya J. Kobayashi
Institute of Industrial Science, The University of Tokyo
tetsuya@sat.t.u-tokyo.ac.jp
Office: 81-354-526-798
