Bioinformatics – The future of Genovix®
At Agronomix Software we never stop thinking about what the future of Genovix might be. By talking to our clients, we discover what they are doing that is new or different from last year. We consider the challenges of developing something new and how we can rise to them. We constantly watch where the plant breeding industry is heading so that we can adjust the development of Genovix accordingly. Bioinformatics is the next big change coming to this industry.
Single Nucleotide Polymorphisms (or SNPs for short!)
In 2004 I was an Applied Genetics student. It was there that I was first introduced to the DNA fingerprinting method Restriction Fragment Length Polymorphism (RFLP) analysis. I eagerly attended a seminar about the “new” single nucleotide polymorphism (SNP) genotyping method then being used in human genome research. Then I worked on a biodiversity project studying wild alpine plants. We used the then more advanced Amplified Fragment Length Polymorphism (AFLP) method for DNA fingerprinting. We determined the ploidy level of some cryptic species using chromosome counting and flow cytometry, which led to the discovery of new polyploid species.
Molecular breeding is here! And Agronomix is already at work!
Masters in the UK
When I went to the UK for my Master's studies, I used Sanger sequencing to identify SNPs within a small portion of the yam genome, amplified with specific primers, in order to compare different cultivars. The latest transcriptome profiling method at that time was Affymetrix GeneChips at the Nottingham Arabidopsis Stock Centre (NASC). I had a course project in which I analyzed the Arabidopsis transcriptome using R/Bioconductor and a commercial software package called Partek. Through that course, I was introduced to the theory of the then-emerging Next Generation Sequencing (NGS) techniques. Meanwhile, whole-genome sequencing of Brassica napus was underway using this new technology.
PhD in Germany
When I joined a lab in Germany for my PhD in 2013, I was introduced to a flood of data: a draft sequence of the single B. napus cultivar ‘Darmor-bzh’ and SNP genotyping of hundreds of cultivars using the 60K Illumina SNP genotyping array. In addition, NGS RNA-seq data, multi-location and multi-year phenotypic data, and climate data added to the complexity of the dataset. I had to start by annotating the draft sequence of B. napus using gene homology to A. thaliana, B. rapa and other organisms.
The next computational task was to map sequences of different cultivars to the one draft reference sequence available at that time. The NGS reads from Illumina needed specialized software of their own. They arrived not in the traditional FASTA format but as FASTQ, which adds a layer of read quality information. It was the researcher's responsibility to identify and filter out the low-quality reads. Figuring out which software was most suitable for each step of the analysis took time away from the actual research, but it was a great experience in the world of bioinformatics.
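To illustrate the difference between the two formats, here is a minimal Python sketch of that quality-filtering step. It assumes standard four-line FASTQ records with Phred+33 quality encoding. The function names and the mean-quality threshold of 20 are hypothetical choices made for illustration, not the tools or settings used in the project described here.

```python
from io import StringIO


def parse_fastq(handle):
    """Yield (read_id, sequence, quality_string) from four-line FASTQ records."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of input
            break
        seq = handle.readline().strip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().strip()
        yield header[1:], seq, qual  # drop the leading '@'


def mean_phred(qual, offset=33):
    """Average Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_reads(handle, min_mean_quality=20):
    """Keep only reads whose mean quality reaches the (assumed) threshold."""
    for read_id, seq, qual in parse_fastq(handle):
        if qual and mean_phred(qual) >= min_mean_quality:
            yield read_id, seq


def to_fasta(reads):
    """Write reads as FASTA text: the quality layer is simply dropped."""
    return "".join(f">{read_id}\n{seq}\n" for read_id, seq in reads)


# Two toy reads: 'I' encodes Phred 40 (high quality), '!' encodes Phred 0.
example = StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nTTGA\n+\n!!!!\n"
)
fasta_text = to_fasta(filter_reads(example))
print(fasta_text)
```

Only read1 survives the filter, and the FASTA output no longer carries any quality information, which is why FASTA files are so much smaller and simpler to handle. Real pipelines usually trim low-quality bases as well as discarding whole reads, which this sketch does not attempt.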
PostDoc in Canada
I was relieved when the NGS sequences for my PostDoc project in Canada came back in FASTA format. This eliminated the huge storage space, computing capacity and time needed to process the FASTQ format. By 2018, sequencing companies had taken on more responsibility for delivering high-quality reads to researchers. Several cultivars of major crops have been sequenced as references. Sequencing has become cheaper, and big data storage and computing facilities are now readily available. This is the best time for Agronomix Software to embark upon a Bioinformatics module. It continues our support of plant breeders as they move into the future of molecular breeding.