Kengo Sato is a professor at Tokyo Denki University School of System Design and Technology, Japan. He received his Ph.D. in Computer Science from Keio University, Japan, in 2003. He was awarded the Oxford Journals JSBi Prize in 2008 and the IPSJ Yamashita SIG Research Award in 2012. His research interests include bioinformatics, machine learning and life sciences. He develops innovative algorithms for sequence analysis and RNA informatics.
PhD in Computer Science, 2003
MEng in Computer Science, 1997
BSc in Mathematics, 1995
This chapter introduces RNA secondary structure prediction based on the nearest neighbor energy model, which is one of the most popular approaches to modeling RNA secondary structure without pseudoknots. We discuss the parameterization of the model and how its parameters are determined by experimental and machine learning-based approaches, as well as an integrated approach in which the two compensate for each other’s shortcomings. Then, folding algorithms that use dynamic programming to find the minimum free energy structure and the maximum expected accuracy structure are introduced. Finally, we compare the prediction accuracy of the methods described in the chapter on benchmark datasets.
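As a rough illustration of the dynamic programming idea behind such folding algorithms, the sketch below uses Nussinov-style base-pair maximization, which is a much simpler scoring scheme than the nearest neighbor energy model discussed in the chapter. The function name `max_base_pairs`, the `min_loop` parameter, and the set of allowed pairs are assumptions made for this example only and are not taken from the chapter.

```python
def max_base_pairs(seq, min_loop=3):
    """Return the maximum number of nested (pseudoknot-free) base pairs.

    Simplified stand-in for energy-based folding: each allowed pair
    scores 1 instead of contributing a nearest neighbor free energy.
    """
    # Watson-Crick pairs plus the G-U wobble pair (assumed for this sketch).
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    # best[i][j] = maximum number of pairs within seq[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    # Fill the table by increasing subsequence span; spans too short
    # to hold a hairpin loop of min_loop bases keep a score of 0.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j is unpaired.
            score = best[i][j - 1]
            # Case 2: position j pairs with some k, leaving at least
            # min_loop unpaired bases between k and j.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + best[k + 1][j - 1] + 1)
            best[i][j] = score
    return best[0][n - 1] if n else 0
```

The table is filled in cubic time with respect to sequence length, the same order as standard pseudoknot-free folding; an energy-based algorithm would replace the unit score with loop energies and minimize rather than maximize.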
RNA structural elements called pseudoknots are involved in various biological phenomena including ribosomal frameshifts. Because it is infeasible to construct an efficiently computable secondary structure model including pseudoknots, secondary structure prediction methods considering pseudoknots are not yet widely available. We developed IPknot, which uses heuristics to speed up computations, but it has remained difficult to apply it to long sequences, such as messenger RNA and viral RNA, because it requires cubic computational time with respect to sequence length and has threshold parameters that need to be manually adjusted. Here, we propose an improvement of IPknot that enables calculation in linear time by employing the LinearPartition model and automatically selects the optimal threshold parameters based on the pseudo-expected accuracy. In addition, IPknot showed favorable prediction accuracy across a wide range of conditions in our exhaustive benchmarking, not only for single sequences but also for multiple alignments.
Accurate predictions of RNA secondary structures can help uncover the roles of functional non-coding RNAs. Although machine learning-based models have achieved high performance in terms of prediction accuracy, overfitting is a common risk for such highly parameterized models. Here we show that overfitting can be minimized when RNA folding scores learnt using a deep neural network are integrated with Turner’s nearest-neighbor free energy parameters. Training the model with thermodynamic regularization ensures that the folding scores and the calculated free energy are as close as possible. In computational experiments designed for newly discovered non-coding RNAs, our algorithm (MXfold2) achieves the most robust and accurate predictions of RNA secondary structures among the several algorithms compared, without sacrificing computational efficiency. The results suggest that integrating thermodynamic information could help improve the robustness of deep learning-based predictions of RNA secondary structure.