
Computational Molecular Bioscience: Concepts

U. S. Raghavender

9781773612423
300 pages
Arcler Education Inc
Overview
This preface provides information that I would recommend to anyone reading and using the book. The book is written with undergraduate and graduate students of the life sciences in mind, and it could also serve as a supplement for people involved in research. With the flood of genomics and proteomics data, it is imperative that students in biology have a basic understanding of the data in biological repositories. Today, academic institutions and organizations focussed on genomics and proteomics research share both raw and processed data on dedicated servers. These datasets are freely accessible to anyone. Understanding how these datasets (for example, PDB or UniProt) were produced requires studying the research that prompted their existence. This comprehension can result in a greater appreciation of the resource. We have provided as much information as is needed to jumpstart a curious reader's comprehension of the actual data from the repositories using computers.
The information processing done in biology using computation (algorithms) constitutes what is known as Computational Biology. The algorithms can be graded from simple to complex, based on the assumptions employed. In this book, we present basic concepts in computational molecular biology in the form of simple capsules. By that I mean that the reader, with a laptop or desktop computer, can immediately start executing the code in the book to understand the concepts at a deeper level. As the focus is not on algorithms, we have kept the language quite simple. Places that demand detailed explanation have been dealt with accordingly; otherwise, the presentation has been kept short.
Basically, we start from sequences, then move to structures, and try to find an evolutionary connection between sequences and structures.
1. We start with an introduction to biology, genes and proteins.
2. Following this, we introduce the basics of Python. This is a must for any curious student starting in computational biology.
3. In the third chapter, we introduce sequences in biology and the commonly used formats (e.g., FASTA). We also introduce different ways of comparing sequences. The BLAST and EMBOSS suites are introduced.
4. R, the statistical language, is introduced. This is of utmost importance because biological observations and experimental results must be assessed for statistical significance. The basics are presented along with the processing of FASTA files.
5. The next chapter is on sequence statistics. With the computational background set for statistical analysis, we embark on common sequence statistics.
6. Researchers work with large numbers of sequences. Hence, the next chapter presents multiple sequence alignment and the different approaches to it.
7. With the flood of genomics data, the sets of sequences produced are large, and computational annotation of gene function(s) becomes imperative. We introduce the KEGG database and present recipes for accessing the data from the resource using computation.
8. A large number of DNA sequences, upon translation, yield proteins that fold into 3D structures. We introduce the reader to the basics of structural biology. We also show how one can handle structural data using computational techniques.
9. Lastly, we introduce evolution, where we try to connect sequences with structures.
10. Two appendices are provided. These are meant for the reader to explore interesting topics in programming, such as regular expressions and data analysis using pandas. They provide not only a quick introduction but also a starting point for exploring these topics further.
Every chapter begins with a small introduction, followed by an explanation of the concepts and a simple example. Code is given in a bounded box, in a font (Centaur) different from that of the rest of the text (Times New Roman). References are provided at the end of each chapter, as needed. Each chapter is written to be self-contained. I would also suggest that the reader become familiar with IPython or Jupyter notebooks and RStudio. Most of the code in this book has been tested in these environments and should be reproducible. Finally, I hope that this book will help readers use their computers productively in their studies in the field of computational biosciences.
Author Bio
Dr. U. S. Raghavender earned his Master's degree in Physics, specializing in Condensed Matter Physics, followed by a doctoral thesis on X-ray crystallography of designed molecules. This was followed by post-doctoral stints in India and abroad, in both academic and non-academic institutions. He has been working on computational approaches to biomolecules for the past 12 years. He has authored book chapters and published research in internationally acclaimed journals. He also serves as a peer reviewer of research articles for both national (Journal of Biosciences) and international journals (BMC Bioinformatics, Nucleic Acids Research, etc.). His long-term goal is to address computational aspects of regulatory mechanisms in the cell.