5 Questions to Diego de Panis, coordinator of the Sequencing & Assembly Committee

Read the full interview with Diego de Panis below:

1. Can you briefly introduce yourself and how you became interested in reference genomes?

My name is Diego De Panis. I’m a biologist and I coordinate the Sequencing and Assembly Committee (SAC) of the European Reference Genome Atlas (ERGA). As part of my PhD I started working with high-throughput sequencing, generating genome assemblies and using them as references for comparative genomics and transcriptomics. At the time the sequencing technology landscape was a little bit different - long reads from PacBio were only starting to be used, so assemblies were mainly short-read based and we would use the long-read data to improve the assembly as far as possible. Then as a postdoc I continued working on reference genome generation. At that time the Hi-C method was starting to be used for scaffolding assemblies and Nanopore data was also beginning to be adopted. I have been involved in similar projects ever since. So you can say that genomes have played a very important role in all my projects since my PhD.

2. Can you describe the main activities of the ERGA SAC?

The Sequencing and Assembly Committee is also its community - they are not independent entities. Our activities mainly focus on community building, facilitating tools, standardising procedures, promoting discussions, supporting the scientific community, that kind of thing. For instance, on the networking side, the activities are closely tied to the goal of strengthening collaboration, reinforcing the network and expanding it. A very important part of this is knowledge sharing. We do this by promoting and providing places for the exchange of ideas, by presenting new methods and updates, and also by giving space for the community to show their work and to seek help or guidance.

SAC also generates resources that stay available to the whole community. We produce bioinformatic workflows - ways of running all the necessary programs in different combinations to produce a genome assembly and perform quality control on it. We test these workflows, write guidelines and directly provide support to the community. All these workflows and guidelines contribute to the goal of establishing standards. For reference genomes to be considered high quality, certain quality standards must be met, and it is easier to meet them if you follow standardised protocols and workflows like the ones we share. So we play a role at this critical gatekeeping point. Related to that, we developed a new reviewing method that is completely open and based on standardised reports showing all the important metrics and quality controls, so that community reviewers can check whether a genome assembly meets all relevant quality standards and decide whether it is a final product ready to be shared with the scientific community. All this work is designed and produced by the community, with a lot of input from the community. This is the work of the network in action: coordinating from the community for the community.

3. Can you tell us about the most interesting and the most challenging aspects of being a part of SAC?

It is super interesting to meet other researchers with similar interests and discuss current developments in the field of reference genomes. I also find it interesting to learn about the particular challenges that members face when trying to generate a high-quality genome assembly for a “weird” species or a not-so-common genus. It is super rewarding to connect with people, transcending all sorts of boundaries - geographical, career, language, resources - and to talk with all these different people about applying scientific thinking to produce amazing resources such as reference genomes. The challenging part is to move forward with an agenda that is truly helpful and will help the community improve. It can be difficult to understand what people need and what will be helpful for them, or to identify when a given matter needs more space or time to be properly addressed. It is also challenging to keep people engaged and avoid Zoom fatigue with so many online meetings.

4. What developments in the world of genome “sequencing and assembly” are you most excited about in the coming years?

All the developments related to ultra-low-input protocols for sequencing library preparation are exciting. These methods already exist but they are quickly improving. The same goes for protocols for tricky or less-than-“ideal” samples - this is essential for the goal of generating genomes for all eukaryotic biodiversity. These developments will bring a lot of progress to the field. Developments related to Oxford Nanopore sequencing and the production of ultra-long reads are also very exciting. I think this technology is super cool and it will bring a lot of new possibilities. I have high expectations about what will finally be delivered, because this could make “telomere-to-telomere” (T2T) genome assemblies more accessible for the community. Finally, of course, there are the developments related to artificial intelligence in general. This technology is proving pretty disruptive in other areas and I think the same will happen in our field soon.

5. What are the next steps for the ERGA SAC?

There is a plan to open more space for discussions about sequencing, as we have been quite focused on the “assembly” part so far. So we should start having some dedicated meetings about this soon. I think it could be useful for the community to have this dedicated space for exchanges between people across Europe. Other future steps are related to strengthening and expanding the sequencing and assembly community by providing and promoting space for discussion, assisting the community and making useful resources available to all.

Send an email to the Sequencing & Assembly Committee and learn more about how you can participate!