Search Results

27 items found for ""

  • ERGA Community genomes (beta) | ERGA

    Are you embarking on a reference genome project? Do you want to learn about the steps required for success? Then join the growing family of ERGA Community Genomes! ERGA aims to coordinate the production of high-quality annotated genome assemblies that represent eukaryotic biodiversity in Europe. A key part of this is building capacity across European researchers and institutes by supporting the growing community of scientists in biodiversity genomics. To that end, ERGA provides guidelines, workflows, and best practices that explain and facilitate the many steps of the reference genome generation workflow. The guidelines below cover the main steps of that workflow, with step-by-step advice and answers to frequently asked questions to help researchers navigate the complexities and find out where to turn for additional assistance:
Menu
1. Pre sampling
2. Sample Acquisition Strategy
3. DNA/RNA extraction
4. Library preparation
5. DNA sequencing data
6. RNA sequencing data
7. Assembly completed
8. Annotation completed
9. Downstream analysis
PLEASE NOTE: This is a work in progress. The initial beta version of these guidelines was developed with input from the ERGA Committee Coordinators. Further development is ongoing and will include all ERGA Committees.
Contributors: Tom Brown, Diego de Panis, Joao Pimenta, Christian de Guttry, Ann Mc Cartney, Rita Monteiro, Javier Palma, Luisa Marins, Astrid Böhne, Robert Waterhouse, Camila Mazzoni.
1. Pre sampling
Letter of Support
Do you wish to show in your grant proposal that you know where and how to find support for the genome generation pipeline? Funding can be difficult to obtain for research in areas where you have no prior experience, so your application can be backed by a letter of support from our chairs.
If you would like this type of assistance, please indicate so on this form.
Your grant's genomic section
I. Do you need assistance in preparing a complete, convincing, and coherent grant application with realistic budget estimates for your project? It can be challenging to prepare your first grant in this area of research, and ERGA offers its hub of knowledge to assist you on your first journey into the world of reference genomes. An online meeting can be arranged where you can benefit from the experience of researchers who have already been through this process several times.
II. Has the grant already been written, but you are unsure about the content of the reference genome generation section? The expertise of ERGA's Committees can assist you in this endeavour. Upon request, we will conduct a brief review of the genomic section of your grant proposal to help ensure it is of a high standard.
Check the current status of the reference genome for the species you wish to sequence
Is another research group already producing the reference genome for your species? On the following portals, you can check whether anyone has already produced the reference genome for your species:
ENA: The European Nucleotide Archive (ENA) operates as a public archive for nucleotide sequence data. By bringing together databases for raw sequence data, assembly information and functional annotation, the ENA provides a comprehensive and integrated resource for this fundamental source of biological information.
ERGA data portal: This portal allows you to see species for which high-quality reference genomes are being produced or have already been produced by ERGA-affiliated projects.
GoaT: Genomes on a Tree presents genome-relevant metadata for all eukaryotic taxa across the tree of life. Metadata in GoaT include genome assembly attributes, genome sizes, C values, and chromosome numbers from multiple sources.
GoaT also collects information from various BioGenome projects about the species they plan to sequence and/or have already started sequencing.
Other BioGenome consortia: Darwin Tree of Life (DToL), Catalan Initiative for the Earth BioGenome Project (CBP), The Earth BioGenome Project (EBP), the Vertebrate Genomes Project (VGP).
2. Sample acquisition strategy
Are you planning your field work and would like to know where to start? Here are some key considerations to help you get started:
a. Permits
Sample collection should comply with local and EU regulations. Before collecting a species, make sure you have all the permissions required for collecting, exporting and sequencing it with open-access data deposition.
i. ABS - Nagoya: The Nagoya Protocol is an international agreement that was adopted in 2010 as a supplementary agreement to the Convention on Biological Diversity. The Nagoya Protocol sets out rules and procedures regarding access to genetic resources and the sharing of the benefits derived from their use. It also provides guidance on how to ensure the fair and equitable sharing of the benefits arising from the utilisation of traditional knowledge associated with genetic resources. The Protocol has been ratified by over 100 countries, and has been widely praised as an effective tool for promoting the conservation and sustainable use of biological diversity. To proceed, researchers should first verify whether their country has signed the agreement and then reach out to the designated focal point, outlining their intention to sequence the genome and release the data, and requesting the necessary permit.
ii. Sample collection permit: It grants permission to the holder to collect wildlife samples, with the understanding that all samples are used for scientific research and educational purposes only.
The holder of this permit must abide by all applicable international, national, and local legislation in the collection of wildlife samples. No sample may be collected without prior approval from the relevant authority. Some general rules, with more detailed information presented in the ERGA Sampling Code of Conduct:
1. Ensuring that all samples collected are properly labelled and documented.
2. Providing adequate protection of the samples while in transport.
3. Maintaining records of all samples collected.
4. Obtaining authorization to transport the samples to the appropriate research facility.
5. The holder of this permit is responsible for disposing of all unwanted samples in a manner that is respectful to the environment and in compliance with all applicable laws.
6. The holder of this permit must provide a copy of the collected samples to the appropriate authorities upon request.
7. The undersigned agrees to abide by the conditions of this permit.
b. Traditional Knowledge and Biocultural labels
It is important to determine whether an indigenous population or a local community is involved in the project or whether the species is of special concern to them. In that case a label should be requested. The Labels allow communities to express local and specific conditions for sharing and engaging in future research and relationships in ways that are consistent with already existing community rules, governance and protocols for using, sharing and circulating knowledge and data. The primary objectives are to enhance and legitimise locally based decision-making and Indigenous governance frameworks for determining ownership, access, and culturally appropriate conditions for sharing historical, contemporary, and future collections of cultural heritage and Indigenous data. For more information check the ‘Local Contexts’ website.
c. Sample Collection
What sampling procedures do you follow in the field? For the generation of reference genomes, the ideal method is preservation in liquid nitrogen.
Many organic materials can be stored in liquid nitrogen, including cells, tissue samples and entire individuals. Because liquid nitrogen freezes samples rapidly, it allows researchers to store them for long periods while minimising DNA/RNA degradation. It is essential to collect and process samples following the requirements specified by the sequencing facility. Species that can be kept alive can be transported to a lab, where the material is processed and flash-frozen immediately after dissection to prevent DNA and RNA degradation. The sample should be dissected on a plate placed on ice, to keep it cold, and then flash-frozen in liquid nitrogen.
d. Taxonomic Validation
Did an expert taxonomist confirm the identity of the collected species? Taxonomic validation is a complex and important process that is necessary for accurately classifying organisms by their physical and genetic traits. There have been cases where a reference genome was produced and the species later turned out not to be the intended target. Whenever possible, we recommend DNA barcoding the sample to prevent this from occurring.
e. Vouchering
A voucher specimen consists of a representative sample of the collected species. A voucher preserves as much as possible of the physical remains of an organism, serving as a verifiable and permanent record of wildlife. The sample is typically collected in the field and preserved in a herbarium or museum collection. Set aside a voucher specimen and take scaled pictures, following the requirements of the respective collection facility. (link of some facilities as an example). In addition, e-vouchers, which involve digital documentation and images, are also permissible in certain cases.
f. Biobanking
Biobanking refers to the storage of biological samples for research purposes.
Animal/plant tissue biobanking is used to track genetic changes over time, which can help understand the evolution of species. Material intended for biobanking should be deposited in biobank repositories. In addition to tissue biobanking, DNA biobanking is also possible. Ideally, tissue and DNA come from the same specimen that will be sequenced, but for very small specimens a different individual can be used. The material should preferably be deposited in a repository in the country of origin of the material. If national infrastructure is not available, or in addition to it, the LIB Biobank at Museum Koenig, Bonn, can centrally store any ERGA project samples. For contact information and sample requirements, please see the LIB Biobank deposition guidelines.
g. Storage
Samples must be kept as cold as possible to prevent DNA degradation prior to sequencing. If possible, place the sample tubes into dry ice, a charged LN2 dry shipper (< -150 °C) or a -80 °C freezer. Please note that wet ice and -20 °C freezers are not appropriate for the storage of tubes containing samples intended for genome sequencing.
h. Material Transfer Agreements
MTAs are agreements between two parties, typically a provider and a recipient, that govern the transfer of biological samples. MTAs are used to ensure that the provider of the material is adequately compensated for the use of the material, and that the recipient of the material is legally and ethically responsible for its use. Sample providers should be aware of any MTA that applies, for example when sending biological material between their research facility and sequencing centres or biobanks. Please check the requirements with your sequencing centre and biobank. More information can be found in the CETAF Code of Conduct and Best Practices (Example MTA without change in ownership).
i. Shipping
All samples must be shipped on dry ice or in a dry shipper. Please make sure that the dry ice is refilled often, including at borders. Pay attention to the regulations of non-EU countries in Europe.
j. ERGA manifest
Do you wish to learn what metadata you need to submit with your sample in order to register it with ERGA as an ERGA Community genome? This is the ERGA sample manifest. Fields marked in bold are the mandatory variables.
k. ENA mandatory fields
The European Nucleotide Archive (ENA) operates as a public archive for nucleotide sequence data. This is the ENA checklist of minimum requirements to register a physical sample.
3. DNA/RNA extraction
Have you acquired the samples and are you ready to extract DNA and RNA? Ideally, high molecular weight (HMW) DNA and RNA should not be shipped, but extracted on site or handled very carefully prior to delivery.
DNA
a. DNA extraction protocols: DNA extraction is the process of isolating DNA from cells, tissues or other biological samples.
b. High Molecular Weight DNA extraction protocols: Please see the following section, Library preparation, for a list of recommended protocols for extracting and preparing HMW DNA for sequencing.
RNA extraction protocols
RNA extraction involves separating ribonucleic acid (RNA) from a cell or a tissue sample.
DNA concentration, integrity, and purity
i. DNA concentration: typically measured in nanograms per microliter; it can be determined using techniques such as Qubit assays.
ii. DNA integrity: a measure of the quality of the DNA. It can be determined using gel electrophoresis or PCR-based methods. It is important to ensure that the DNA is intact and not degraded, as degradation can affect the accuracy of results.
iii. DNA purity: a measure of the level of impurities in DNA samples. It is important to ensure that DNA is free from contaminants, as these can affect the accuracy of results. DNA purity can be assessed using spectrophotometry-based methods such as NanoDrop.
4. Library preparation
You’ve extracted your DNA and are wondering how to go about getting the DNA/RNA sequencing data required to assemble or annotate your genome?
DNA library preparation is a key step in the process of sequencing. The library preparation will determine the quality of your assembly and annotation, so the DNA must be processed properly to obtain accurate and reliable results. Here you can find our recommended library preparation protocols for PacBio, Oxford Nanopore Technologies, Chromatin Conformation Capture (Hi-C) sequencing and whole-transcript sequencing, among others.
PacBio HiFi
Typically made up of DNA fragments around 10-15 kb in size and with an accuracy of over 99%, PacBio HiFi reads are constructed by circularising DNA and creating a high-accuracy Circular Consensus Sequence (CCS). This protocol has a history of producing high-quality de novo reference genomes for a wide range of species.
ONT
Oxford Nanopore Technologies offers an alternative way to read long pieces of DNA, by measuring the electrical fluctuations caused by nucleotides passing through a membrane pore. The reads sequenced here can be much longer than with PacBio HiFi (typically over 30 kb, and established ultra-long library protocols yield reads of over 200 kb in length) but come with a higher error rate. As the hardware and base-calling software have improved over time, the modal error rate has fallen from over 15% to almost 1%.
Hi-C
Arima or Dovetail Genomics 3-dimensional Chromatin Conformation Capture libraries allow us to gain insight into the organisation of the genome into Topologically Associated Domains (TADs), eu- and heterochromatin and chromosomes. In the generation of a reference genome, we leverage the information that regions close together in the linear sequence are more likely to be close together in 3D space to order and orient our smaller assembled sequences (contigs and scaffolds) into chromosomes.
Hi-C protocols generally follow the steps of isolating nuclei, cross-linking chromatin in its 3D conformation, digesting the DNA at either enzyme motif sites (Arima) or DNase-exposed areas of the genome (Dovetail) and then sequencing the two cross-linked regions via paired-end sequencing on an Illumina device.
Illumina shotgun sequencing
Useful for error-correction of the final assembly, or for identifying sequences from parental lines when performing a trio-binned assembly, Whole Genome Sequencing (WGS or Shotgun Sequencing) aims to sequence the entire genome in short fragments (typically 100/150 bp paired-end libraries) with high accuracy (Q30 or 99.9% accuracy).
RNA-seq
Recommended to support the annotation of your reference genome. Sequencing of RNA-seq libraries is typically performed on an Illumina instrument after RNA has been extracted from your tissues of interest (usually brain or gonad for genome annotation), converted to cDNA and finally amplified before loading onto an instrument.
Iso-seq
The PacBio Iso-seq protocol offers full-length sequencing of transcripts, which is particularly powerful when annotating alternate isoforms in the genome. The sequencing is performed on a PacBio instrument and again leverages the repeated sequencing of circular cDNA to create a high-accuracy consensus sequence for each transcript.
5. DNA sequencing data
You’ve finished the DNA sequencing for your genome and want some guidance with your assembly to ensure you meet ERGA quality standards? The Sequencing and Assembly Committee will prepare a number of workflows that you can download and run to assemble your genome.
I’m having trouble with my assembly
Assembling a partially pentaploid, highly repetitive, AT-rich genome? The Sequencing & Assembly Committee (SAC) would love to hear about your genome and can advise on what to do next.
Contact assembly@erga-biodiversity.eu to arrange a presentation at the fortnightly committee meeting and get feedback from our members.
6. RNA sequencing data - “An assembly is nothing without an annotation”
After you have produced a reference-quality genome assembly, you should think about annotating the key features of your genome. This includes, but is not limited to, finding and recording the locations of: Repeat sequences; Transposable Elements; Telomeres and Centromeres; Protein-coding sequences; Micro-transcript sequences (miRNA); Non-coding sequences (ncRNA). The Annotation Committee has prepared a number of workflows that you can download and run to annotate your genome. The Annotation Committee can guide you with some of these steps, or for ERGA Community genomes, we also recommend uploading your genome to ENA, where Ensembl can annotate your genome using publicly available transcript data.
7. Assembly completed
You have produced a genome assembly and want to associate it with ERGA as a Community genome? Here we detail the next steps required to obtain the ERGA label for your genome and some recommendations for what to do next as part of our best practices:
How do I know if my assembly is good enough?
First, your assembly should meet the EBP metrics. The Sequencing and Assembly Committee will be able to guide you through the post-assembly QC process. Either submit an EAR or present your genome at a SAC meeting.
Open-access genomes for all
If you have a high-quality genome and want to associate it with ERGA, it needs to be of EBP quality and in the public domain. We recommend uploading your genome to ENA and then contacting the SAC. Once your genome has the “Seal of Approval”, we will link your publicly available genome to the ERGA Community Genomes BioProject.
8. Annotation completed
You have an assembly and annotation that you wish to associate with ERGA as a Community genome?
Here we detail the next steps required to obtain the ERGA label for your genome and some recommended next steps:
How do I know if my assembly and annotation are good enough?
First, your assembly should meet the EBP metrics. The ERGA Annotation Committee will be able to guide you through the post-assembly QC process. Either submit an EAR or present your genome at a SAC meeting. Your annotation should be in a format that can be downloaded and used by all (e.g. gff3) and linked to your assembly.
How do I get the ERGA label?
You need to upload your assembly, annotation and all sequenced data to ENA in order to be associated with the ERGA BioProject. Once your genome and data are available, contact the SAC to get the “Seal of Approval” and have your genome linked to the ERGA BioProject. If you wish to make use of the Ensembl rapid annotation, all associated transcript sequencing data also need to be published on ENA.
What next?
Now that you have a high-quality genome, there is a host of analyses that can be performed, including Population Genomics, Phylogenomics, Comparative Genomics & Functional Genomics. The Data Analysis Committee has produced a guide on how to conduct a variety of Downstream Analyses.
9. Downstream analysis
You have an ERGA reference genome and you would like to analyse the data? Here we suggest the next steps for planning your downstream analysis to the highest scientific standards, with recommended frameworks and pipelines for tackling your research questions using your reference genome. High-quality reference genomes are an essential tool to detect genic and intergenic regions and identify genetic variants (e.g. SNPs, CNVs, and structural variants), which are crucial to understand processes in the different fields of genomic research.
The Data Analysis Committee (DAC) can provide additional help through its subcommittees devoted to the different fields of genomic research: Population Genomics, Phylogenomics, Comparative Genomics & Functional Genomics. You can contact the subcommittee relevant to your research question and meet with several experts in the field. You can also take the opportunity to present your research to the ERGA community and get relevant feedback to develop your research. DAC also offers opportunities for training through its conferences and workshops organised in collaboration with the Training and Knowledge Transfer committee.
DAC Subcommittees
i. Population Genomics: this subcommittee brings together researchers who specialize in studying the genetic variation and evolutionary processes within populations. This field combines the principles of genetics, genomics, and population biology to understand how genetic diversity arises, spreads, and changes over time. The main objective of this group is to support the investigation of the genetic factors influencing the composition and dynamics of populations and species. Through collaborative efforts and interdisciplinary approaches, the subcommittee intends to contribute to the broader field of genomics and its applications in various areas of biodiversity.
ii. Phylogenomics: this subcommittee brings together researchers who are focused on studying evolutionary relationships and the diversification of organisms using genomic data. The subcommittee's main purpose is to support the development of research on accurate and robust reconstruction of phylogenetic trees or evolutionary histories using genomic information. Through collaboration with research teams, the subcommittee intends to provide valuable insights into the tree of life and clarify the evolutionary history of European species.
iii. 
Comparative Genomics: this subcommittee brings together researchers devoted to studying and comparing the genomes of different organisms to gain insights into their evolutionary relationships, genetic variations, and functional elements. Comparative genomics combines genomics, bioinformatics, and evolutionary biology to explore the similarities and differences in the genetic makeup of various species. The subcommittee's main purpose is to support the analysis and interpretation of genomic data from multiple organisms, identifying shared and unique genomic characteristics in order to infer evolutionary relationships, gene function, and evolutionary processes. The subcommittee intends to promote the advancement of our understanding of the genomic landscape across species. By comparing and analyzing genomic data, research in this field will offer insights into the evolutionary history and functional elements of genomes, ultimately contributing to various aspects of biological research.
iv. Functional Genomics: this subcommittee brings together researchers devoted to understanding the functional elements and activities of genomes, clarifying the functions and interactions of genes, non-coding elements, and regulatory networks, as well as their roles in various biological processes and disease conditions. The main objective of this group is to support research on how genomic information is translated into functional outcomes, exploring the relationships between DNA sequences, gene expression patterns, protein production, and cellular processes. The subcommittee intends to provide insights into the functional aspects of genomes, gene functions, regulatory networks, and their impact on biological processes and disease conditions.
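The library preparation notes in this guide quote read quality both as a percentage and as a Phred score (for example "Q30 or 99.9% accuracy" for Illumina reads, over 99% for PacBio HiFi, and a modal error rate between 15% and 1% for ONT). As a minimal sketch of how the two scales relate, the Python snippet below applies the standard Phred definition Q = -10 * log10(error probability). The function names are illustrative assumptions and are not part of any ERGA workflow.

```python
import math


def phred_from_error(p_error: float) -> float:
    """Convert a per-base error probability into a Phred quality score."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10.0 * math.log10(p_error)


def error_from_phred(q: float) -> float:
    """Convert a Phred quality score back into a per-base error probability."""
    return 10.0 ** (-q / 10.0)


def accuracy_from_phred(q: float) -> float:
    """Per-base accuracy is one minus the error probability."""
    return 1.0 - error_from_phred(q)


if __name__ == "__main__":
    # Error rates mentioned in the text: 15% and 1% (ONT), 0.1% (Illumina Q30).
    for label, p_error in [("15% error", 0.15), ("1% error", 0.01), ("0.1% error", 0.001)]:
        print(f"{label}: Q{phred_from_error(p_error):.1f}")
```

Working on the Phred scale makes technologies easier to compare, because each step of 10 corresponds to a tenfold change in error rate.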

  • TKT - Training and Knowledge Transfer

    TKT - Training and Knowledge Transfer
training@erga-biodiversity.eu
The Training and Knowledge Transfer (TKT) committee aims to support the design and implementation of learning and skill-sharing activities in the field of biodiversity genomics research. Our committee actively engages with all ERGA committees to collect, promote, and develop training materials, including webinars, workshops and activities, and make them available to the ERGA community. We work with the ERGA community to connect interested experts with members wishing to learn new skills and improve their knowledge of the chain of steps required for reference genome generation. We also connect and support members to collaboratively develop funding proposals for financing TKT-related activities, and we coordinate the programme for the monthly ERGA Plenary meetings.
V.1.0 02.05.2023
Chair: Alice Mouton
Coordinator: Christian de Guttry
Steering Committee: Lino Ometto, Nadège Guiglielmon, Robert Waterhouse, Spiros Papakostas, Jean-François Flot

  • SUPPORT | ERGA

    ERGA Support Request

  • FAQs | erga

    What is ERGA?
The European Reference Genome Atlas is a community of peers working to advance the generation of reference genomes for European Biodiversity. ERGA members share a passion for biodiversity and see reference genomes as key resources that can boost our understanding of biodiversity and inform conservation strategies. Our community is made up of researchers with very diverse expertise and backgrounds working in the European continent or interested in European biodiversity. ERGA also represents the European node of the global Earth BioGenome Project, which has the goal of coordinating the generation of reference genomes for all of Earth’s Biodiversity.
What are ERGA’s main goals?
ERGA’s Core Objectives are to:
Create and consolidate a collaborative and interdisciplinary network of scientists across Europe and associated countries to deliver reference genome sequences;
Connect relevant infrastructures across Europe following a distributed model for genome sequence generation and analysis that can increase dynamically;
Develop guidelines and best practices for state-of-the-art reference genome sequence generation, and disseminate them through training and knowledge transfer;
Connect BioGenome initiatives working on European species to each other and with ERGA’s own initiatives to maximise synergies.
How can I get involved and contribute to ERGA?
Firstly, please register as an ERGA member. Membership is free and will ensure you receive our monthly newsletter and information about upcoming events and meetings. Once you become a member, you will have easy access to ERGA meetings. Our monthly plenary meetings are a good starting point to get to know the community. If you are interested or need support with a specific step of the genome generation process, you might want to interact with or even join one of the open ERGA committees. Each committee has its own way of operating and a monthly meeting slot.
If you want to participate in any of the committees, just send an email to the committee’s address to be added to their communication channels and learn the best opportunities to contribute. If you have an ongoing genome project of any European eukaryotic species, you can associate it with ERGA as an ERGA Community Genome. Check this page for more information on this procedure.
What are the benefits of joining ERGA?
If you are a researcher working on biodiversity genomics, joining and following ERGA’s activities can bring many advantages, including:
Taking an active role in the generation of high-quality reference genomes for biodiversity conservation;
Networking - through our network you will be able to interact and collaborate with colleagues from all across Europe working on topics related to your research;
Get support from the ERGA Committees - as a member, you have direct access to groups of specialists in all steps of the genome production workflow;
Go beyond science - Besides producing reference genomes and connecting researchers, ERGA is also committed to reaching out beyond academia to disseminate the importance of biodiversity and the role of genomics;
From theory to practice - Lead the application of genomics technologies to biodiversity research and conservation directly in the field.
What is the policy of ERGA on data?
Check our Open Data Policy. This covers key requirements and recommendations regarding the collection, processing, storage, and publishing of metadata and data related to the production of high-quality reference genomes. If you have questions or concerns about our data policy, please reach out to the IT & Infrastructure committee at itinfra@erga-biodiversity.eu.
How can I connect with other members of ERGA in my country?
To interact with the ERGA Community in your country, please contact your country’s Council representative through the email available here and ask about any local initiatives already in place and how to engage.
If your country is not yet represented in the ERGA council, we are happy to welcome new countries and hope to have representation from all European countries! Please refer to the Governance Document for more details on how to join the ERGA Council as a representative of your country.
How can I get in touch if I have other questions?
You can reach out to ERGA through many channels. Here are some ways to get in touch with us:
Email us at contact@erga-biodiversity.eu
You can join the ERGA Keybase team and ask your question in one of the many channels (instructions for this are provided when you sign up to become a member)
Social Media: You can also follow us on X @erga_biodiv (previously Twitter), ERGA LinkedIn and Mastodon.

  • Library | ERGA

    ERGA Library
Publication: A Faroese perspective on decoding life for sustainable use of nature and protection of biodiversity. Year: 2024. DOI/URL: https://doi.org/10.1038/s44185-024-00068-0
Publication: The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics. Year: 2024. DOI/URL: https://doi.org/10.1038/s44185-024-00054-6
Publication: Building a Portuguese coalition for biodiversity genomics. Year: 2024. DOI/URL: https://doi.org/10.1038/s44185-024-00061-7
Publication: Contextualising samples: supporting reference genomes of European biodiversity through sample and associated metadata collection. Year: 2024. DOI/URL: https://doi.org/10.1038/s44185-024-00053-7
Publication: First Chromosome-Level Genome Assembly of a Ribbon Worm from the Hoplonemertea Clade, Emplectonema gracile, and Its Structural Annotation. Year: 2024. DOI/URL: https://doi.org/10.1093/gbe/evae127 (Funded by the Research Council of Norway project “InvertOmics - phylogeny and evolution of lophotrochozoan invertebrates based on genomic data” (project number: 300587 to T.H.S.))
Next Publication The genome sequence of the Violet Carpenter Bee, Xylocopa violacea (Linnaeus, 1785): a hymenopteran species undergoing range expansion Year: 2024 DOI/URL: https://doi.org/10.1038/s41437-024-00720-2 Next Presentation Genomic Data Production Systems to Catalogue and Explore Eukaryotic Biodiversity Year: 2024 DOI/URL: https://doi.org/10.5281/zenodo.12200270 Next Poster Community-driven standards development for reference genome generation Year: 2024 DOI/URL: https://doi.org/10.7490/f1000research.1119761.1 Next Publication The genome of the rayed Mediterranean limpet Patella caerulea (Linnaeus, 1758) Year: 2024 DOI/URL: https://doi.org/10.1093/gbe/evae070 Next Publication The genome sequence of the Cretan wall lizard, Podarcis cretensis (Wettstein, 1952) Year: 2024 DOI/URL: https://doi.org/10.12688/wellcomeopenres.21176.1) Next Publication The genome sequence of the Montseny horsehair worm, Gordionus montsenyensis sp. nov., a key resource to investigate Ecdysozoa evolution Year: 2024 DOI/URL: https://doi.org/10.24072/pcjournal.381 Next Publication Scalable, accessible and reproducible reference genome assembly and evaluation in Galaxy Year: 2024 DOI/URL: https://doi.org/10.1038/s41587-023-02100-3 Next 1 2 1 ... 1 2 ... 2

  • Social Justice | ERGA

    Social Justice Committee

    Definition
    Generating high-quality eukaryotic reference genomes is transforming our understanding of biology and evolution. The process of developing this resource so that it has long-term utility is complex and intricate, requiring not only technical and scientific expertise but also the integration of social justice principles. These data will have a significant impact on society, making the incorporation of social justice principles essential. In the ERGA community research setting, social justice means treating everyone fairly and ensuring that individuals from all backgrounds can benefit from our research. Key components of this process include diversity in research participation, fair distribution of research benefits, adherence to high ethical standards, dissemination of research findings to a broad audience, and fostering an inclusive and supportive research environment.
    The ERGA Social Justice Committee is dedicated to embedding justice, equity, diversity, and inclusion principles into every aspect of eukaryotic genome production, from sampling to results dissemination. This committee serves as an ethical compass for ERGA members, guiding the community to ensure that every step in the genome generation pipeline is conducted with social responsibility and respect for diversity. By integrating these principles, we aim to ensure both scientific rigor and social responsibility in our guidelines for generating high-quality reference genomes.

    Authors in alphabetical order
    Chiara Bortoluzzi, Christian de Guttry, James Fleming, Fabrizio Ghiselli, Jennifer Leonard, Rebekah Oomen

    Objectives
    Promoting Diversity: In ERGA, diversity is multifaceted, encompassing the composition of research teams with individuals from diverse backgrounds and expertise, the variety of taxa sequenced and their geographical origins, as well as the involvement of stakeholders and citizen scientists. This approach ensures research methodologies and outcomes reflect nature's extensive diversity.
    Ensuring Equity: Equity in ERGA is about providing equal access to resources and opportunities across all individuals, communities, countries, and research institutions. It particularly aims to include those historically underrepresented or marginalised. Transfer of Knowledge is integral to this effort, ensuring broad participation in this research.
    Advancing Inclusion: Inclusion involves creating a research environment that values and welcomes the contributions of all, aiming to promote a setting in which every participant can thrive and deliver their maximum potential. To achieve this, we focus on enhanced communication tools, aiming to ensure everyone feels comfortable and supported.
    Upholding Justice: Justice means ensuring genomic research processes are available to interested researchers. It also means recognizing and addressing traditional social inequalities affecting current research practices.

    Application in High-Quality Reference Genome Generation

    Diversity
    Sequencing: Allocate sequencing capacity to underrepresented taxa to broaden biodiversity knowledge.
    Collaboration: Establish diverse consortia to ensure broad geographic representation in genomic research initiatives. Promote gender equality in research teams and leadership positions within genomic projects.
    Outreach: Develop educational materials on genomics tailored to different academic backgrounds, ages, cultures, and languages.

    Equity
    Sample collection: Ensure equitable access to the benefits from genetic resources for source countries and communities, in line with the Access and Benefit Sharing framework (ABS) and the Nagoya Protocol. Ensure, where possible, equitable access to the field for researchers with diverse needs.
    Wet lab: Partner with local labs in sample-origin countries to build capacity and share expertise.
    Sequencing: Offer training programs and protocol sharing in sequencing techniques for scientists from all backgrounds.
    Genome assembly: Provide open-source software and pipelines together with cloud-based computational resources for researchers who need access to bioinformatic support and computing power.
    Publishing: Encourage open access availability, either through open access publication or deposition of versions of papers in open access repositories.
    Technology transfer: Facilitate the transfer of cutting-edge genomic technologies to laboratories in low-income countries. Provide legal and technical assistance to navigate regulations.

    Inclusion
    Sample collection: Implement informed consent protocols that respect Indigenous and local communities' rights and traditions. Empower the efforts of local taxonomic experts alongside those communities throughout the sample collection process.
    Publishing: Adopt open-access policies for publishing results, making information freely available to the community as soon as possible.
    Data sharing: Facilitate and encourage the rapid sharing of data in global databases that are freely accessible and FAIR (Findable, Accessible, Interoperable, and Reusable), promoting data democracy.
    Capacity building: Establish mentorship programs connecting established scientists with emerging European-based researchers. In this way, we aim to encourage the development of a new generation of scientists with a representative and diverse mix of abilities, genders, ethnicities, cultural and economic backgrounds, and geographical origins.
    Community engagement: Acknowledge the diverse contributions made beyond academia (universities, museums, research institutions), particularly those from local communities and underrepresented groups, at all steps of the process from sampling to genome generation, and appropriately recognize their participation.
    Outreach: Host public science events in biodiversity genomics in diverse geographical locations to spread awareness and foster interest. Ensure scientific events are organised in a way that is inclusive and accessible, both physically and socially.

    Social Justice
    Relevance: Acknowledging that both research and its outcomes could disproportionately affect specific communities within Europe, we commit to responsibly using outreach, engagement, and communication channels to center local communities impacted by biodiversity loss and anthropogenic environmental change, which are directly addressed in the ERGA remit.
    Personal data sharing: Ensure that data-sharing practices respect the privacy and rights of individuals and communities represented in the data.
    Ethics: Regularly review external bodies' ethical guidelines to address emerging issues related to social justice in genomics and strive to position ERGA to be as inclusive as possible.
    Sustainability: Research and implement sustainable laboratory and computational practices to reduce waste and energy consumption. Evaluate the long-term environmental impact of genomic research activities and develop strategies to mitigate negative effects.

    Conclusion
    Achieving the broad goals of social justice, diversity, equity, and inclusion in genomic research presents significant challenges. One major hurdle is the intrinsic resource and infrastructure disparities across different European regions. This discrepancy limits access to advanced genomic technologies, computational resources, and skilled personnel, widening the gap between well-funded institutions in Strengthening countries and less-funded ones in Widening countries. It is also imperative to note that the historical underrepresentation of some groups and species in genomic studies poses ethical and logistical challenges when redressing these imbalances. The complexities of integrating diverse biological samples, especially from Indigenous and marginalised communities, require sensitive, informed consent processes and benefit-sharing arrangements that respect both legal frameworks and ethical considerations. Furthermore, incorporating a wide range of species and their geographical origins into research necessitates a collaborative effort, which geopolitical, financial, and linguistic barriers can hamper.
    There are as many solutions to these challenges as there are issues themselves. Fostering international collaborations that share resources, knowledge, and skills is a key strategy for building capacity in underrepresented regions and marginalised groups. Initiatives like cloud-based computational resources, open-source software, and open-access publishing models can democratise access to genomic research tools and findings. Furthermore, engaging local communities in the research process, from planning through to publication, ensures that projects are culturally sensitive and ethically sound, while also facilitating the equitable sharing of benefits. Education and outreach, tailored to diverse audiences, can raise awareness and foster a more inclusive next generation of genomic researchers. Ultimately, the path to achieving Social Justice in genomic research is ongoing and requires a commitment to continuous reflection, adaptation, and action towards these ideals.

  • ITIC - IT & Infrastructure Committee

    ITIC - IT & Infrastructure Committee
    itinfra@erga-biodiversity.eu

    In the IT and Infrastructure committee, we aim to facilitate robust and reproducible science in all steps of creating an ERGA genome. The IT Committee oversees the use of various platforms to keep all other committees up to date, secure and compliant with EU laws and regulations. The IT Committee is central to ensuring that data and metadata are accessible to the public and in line with our Open Access Policies. Send us an email if you would like to be involved in our work. (V.1.0 22.05.2023)

    Coordinators
    Tom Brown, Christian de Guttry

    Resources
    🔗 How-to-guide: Submitting data to ENA
    Gustafsson, O.J.R., Wilkinson, S.R., Bacall, F., Pireddu, L., Soiland-Reyes, S., Leo, S., Owen, S., Juty, N., Fernández, J.M., Grüning, B. and Brown, T., 2024. WorkflowHub: a registry for computational workflows. arXiv preprint arXiv:2410.06941. https://doi.org/10.48550/arXiv.2410.06941

    Press Releases: Genetic adaptation of Northern chamois ecotypes to climate change and habitat loss
    Events: What happened at the UN Biodiversity Conference (COP 16) on Digital Sequence Information? Can I still use open public genetic sequence data?
    ERGA Newsletter: ERGA News #24 - November 2024
    Events: Evolutionary transcriptomics in brown algae

  • ERGA-BGE | ERGA

    Biodiversity Genomics Europe (BGE)
    The Biodiversity Genomics Europe Project has the overriding aim of accelerating the use of genomic science to enhance understanding of biodiversity, monitor biodiversity change, and guide interventions to address its decline. The BGE Project comprises activities focused on DNA Barcoding (Barcoding Stream) and Reference Genome Generation (Genomes Stream) for eukaryotic species across Europe, bringing together two European networks: iBOL Europe and the European Reference Genome Atlas (ERGA).

    The ERGA Stream of BGE
    The Genomes Stream of BGE, as the European node of the Earth BioGenome Project (EBP), aims to establish and implement large-scale biodiversity genomic data generation pipelines to accelerate the production and accessibility of reference-quality, complete genome sequences for species across the whole of European biodiversity. The output will support applications in the fields of biodiversity characterisation, conservation, and biomonitoring. The Genomes Stream focuses on generating reference-quality genomes from critical European biodiversity, biodiversity hotspots, pollinators, and a selection of applied case studies.

    BGE-ERGA Stream Work Packages:

    BGE-ERGA News
    Genetic adaptation of Northern chamois ecotypes to climate change and habitat loss
    Mapping the genomic basis of common thyme aromatic diversity and its adaptive significance for ecotype formation and climate change adaptation
    ERGA News #23 - October 2024

    Partner Institutions
    Leibniz Institute for Zoo and Wildlife Research, University of Lausanne, University of Florence, Cibio, Genomescope, CSIC, University of Oslo, Sanger, Earlham Institute
    Discover the whole BGE network

  • Projects | ERGA

    Newsroom Project Name https://www.uef.fi/en/article/erga-is-mapping-the-dna-of-european-species-finland-represented-by-a-mountain-hare-genome

  • Glossary | ERGA

    Glossary
    This page provides explanations about terms and acronyms often used within ERGA and in the context of Biodiversity Genomics. You can filter the terms alphabetically or according to categories: Annotation, Citizen Science, Data Analysis, ELSI, Media & Communications, Other, Sampling & Sample Processing, Sequencing & Assembly.

    (Genome) annotation: The process of identifying the functions of different pieces of a genome. This includes genes that code for proteins and non-coding features (e.g. intron-exon structure of protein-coding genes, promoters, transposable elements). Typically performed using computational methods, followed by manual curation.
    (Genome) assembly: A genome assembly is a representation of an organism’s genome that is made using computer programs to turn (assemble) raw sequence data into longer, continuous sequences.
    (Genome) completeness: An estimate of how well a reference genome represents the complete sequence of the target organism. A complete genome should equal the haploid genome size of the target, but a genome may also be defined as complete when ‘all chromosomes are gapless and have no runs of 10 or more ambiguous bases, there are no unplaced or unlocalized scaffolds, and all expected chromosomes are present.’ (https://www.ncbi.nlm.nih.gov/assembly/). There are different approaches to estimating completeness, such as BUSCO and K-mer analysis.
    ABS: Access & Benefit Sharing
    BGE: Biodiversity Genomics Europe. The BGE Project has received funding through a Horizon Europe call on Biodiversity and Ecosystem Services. The overarching BGE project includes two streams of genomic research: reference genomes and barcoding, in an effort to establish ERGA and BIOSCAN as the European nodes of the Earth Biogenome Project and of the International Barcode of Life (IBOL), respectively.
    BUSCO: A bioinformatic method (Benchmarking Universal Single-Copy Orthologues) used to estimate the completeness of the coding fraction of an organism’s genome based on the proportion of (lineage-specific) single-copy orthologous genes that are found in a genome assembly.
    Biodiversity genomics: The application of genomic methods to research biodiversity.
    CARE Principles: The CARE principles for Indigenous data governance (https://www.gida-global.org/care) provide a governance framework that supports the recognition of Indigenous Peoples’ rights and interests in their physical and digital data as well as their Indigenous Knowledges.
    CBD: Convention on Biological Diversity
    COPO: The Collaborative OPen Omics (COPO) platform is for researchers to publish their research assets, providing metadata annotation and deposition capability. It allows researchers to describe their datasets according to community standards and broker the submission of such data to appropriate repositories whilst tracking the resulting accessions/identifiers. Learn more about COPO in this article by the Earlham Institute.
    CS: Citizen Science Committee
    Chromosome-level assembly: The process of generating a contiguous sequence of all chromosomes of a genome, often aided by genetic maps or proximity ligation techniques (3C-seq, Hi-C); the term is also used to refer to the resulting genome sequence.
    Council meetings: During the monthly ERGA council meetings, the representatives of countries and other genome projects associated with ERGA meet to discuss and vote on important matters related to ERGA’s governance and actions. The council is the main decision-making body of the consortium. Learn more about ERGA's structure in our Governance Document.
    DAC: Data Analysis Committee
    DSI: Digital Sequence Information - learn more: https://www.cbd.int/dsi-gr/
    DToL: The Darwin Tree of Life Project aims to sequence the genomes of 70,000 species of eukaryotic organisms in Britain and Ireland.
    EBP: The Earth BioGenome Project
    EBP Genome assembly quality standard 6.C.Q40: Minimum reference standard of 6.C.Q40, i.e. megabase N50 contig continuity and chromosomal scale N50 scaffolding, with less than 1/10,000 error rate. For species with chromosome N50 smaller than a megabase this will be C.C.Q40. Additional recommendations include K-mer completeness >90%, BUSCO complete single-copy >90%, BUSCO complete duplicated <5%, and Gaps/Gbp <1000.
    EC: European Commission
    ELSI: Ethical, Legal, and Social Issues (Committee)
    ENA: The European Nucleotide Archive (https://www.ebi.ac.uk/ena) is a global repository for sequence data and provides resources that support management and access to sequence data.
    ERGA: European Reference Genome Atlas
    ERGA Plenary: Our plenary meetings are open to all registered ERGA members and generally include short updates given by committee chairs and one invited talk on various themes connected to biodiversity genomics (watch the previous ones here).
    ERGANews: ERGA’s monthly newsletter includes important updates about the consortium, each of the committees and associated projects. Our newsletters are usually published on the first Tuesday of each month. All editions of the newsletter are stored here.
    Equity Deserving: According to the Canadian Council (https://canadacouncil.ca/glossary/equity-seeking-groups), equity deserving groups are those individual researchers, communities, Peoples, regions or countries that have identified barriers to equal access, opportunities, and resources due to disadvantage and/or discrimination and that are actively seeking, and deserving of, social justice and reparation. The discrimination experienced could be caused by attitudinal, historic, social, and environmental barriers that could be based on a plethora of characteristics including (but not limited to) sex, age, ethnicity, disability, economic status, gender, gender expression, nationality, race, sexual orientation, and creed.
    FAIR Principles: A set of principles to guide appropriate management and curation of scientific data (https://www.go-fair.org/fair-principles/) that emphasise data accessibility and use by ensuring that data are Findable, Accessible, Interoperable, and Reusable. Due to the increasing amount of scientific data being deposited, FAIR guidelines promote a data format that is amenable to automated computational access of data by stakeholders.
    GoaT: Genomes On A Tree
    HE: Horizon Europe; sometimes refers to the BGE project funded under HE
    HSM: Hierarchical Storage Management is both a data management and data storage technique which transparently manages the movement of data between the different layers of a tiered storage based on file size thresholds, usage and I/O pressure. Usually, a tiered storage is composed of one or more layers of disk arrays, ordered by capacity, latency, redundancy and storage cost. A slow but economically effective archival layer is at the bottom, composed of magnetic tape libraries and automated tape robots, with the highest capacity and latency. The movement between layers is automatically triggered.
    Haplotype: A haplotype refers to the collection of genetic material within an organism that is inherited together. Haplotype may be used to describe a few loci or any number of chromosomes (a chromosome-scale haplotype).
    Hi-C: Sequencing-based method used to study three-dimensional interactions among chromatin regions by measuring the frequency of contact between pairs of loci. Since contact frequency is related to the distance between a pair of loci, Hi-C linking information is used to help with scaffolding stages during a genome assembly process.
    Hi-C map / graph production: The occurrence and frequency of Hi-C contacts are analysed and used in assembly scaffolding. They are typically visualised in Hi-C 2D heatmaps with the full genome sequence on the X and Y axis and a markup for each observed contact.
    HiFi reads: HiFi (High Fidelity) PacBio reads are produced by taking multiple sequences of the same molecule to provide a consensus sequence that is usually 12-20 kbp long and has a low error rate (>99.9% consensus accuracy).
    INSDC: International Nucleotide Sequence Database Collaboration (https://www.insdc.org/) is an initiative between the DDBJ, EMBL-EBI and NCBI that together act as a global repository of sequence data and associated metadata, and provide tools and services that allow access to genomic resources.
    ITIC: IT & Infrastructure Committee
    IsoSeq: A sequencing protocol developed by PacBio that aims to sequence full-length transcripts using the accurate, long-read capabilities of PacBio HiFi technology. IsoSeq data facilitate analysis of transcriptomes and genome annotation by identifying full-length isoforms of transcripts.
    JEDI / DEIJ: Justice, Equity, Diversity, and Inclusion Subcommittee
    K-mer: A K-mer is a DNA sequence of length k; for example, the sequence AGCT contains the 3-mers (K-mers of length 3) AGC and GCT.
    Library: DNA, cDNA, or RNA that has been prepared for NGS within (usually) a specific size range and containing adapters, which are designed to be appropriate for (a) specific sequencing platform(s).
    M&C: Media & Communications Committee
    Metadata: A collection of data that provides contextual information about multiple characteristics of other, corresponding original data.
    ONT: Oxford Nanopore Technologies (ONT; https://nanoporetech.com/) is a next generation sequencing technology whereby sequence data are generated from the changes in current that occur as single-stranded DNA or RNA molecules pass through nanoscale protein pores (nanopores). ONT provides long read data (up to several megabases) that facilitate genome assembly.
    Omni-C: Modified version of Hi-C that uses a sequence-independent endonuclease during its protocol to produce more even sequence coverage, increasing overall resolution.
    Open data: Open data are freely accessible and unrestricted data that can be accessed, used, reused and shared with third parties for any purpose.
    PUID: A permanent unique identifier is a unique label for an object that does not change, such as the Digital Object Identifier (DOI) attached to a scientific publication.
    PacBio: Pacific Biosciences (PacBio; https://www.pacb.com/) is a single-molecule, real time (SMRT) next generation sequencing technology in which sequence data are generated by fluorescent light emission that occurs when a DNA polymerase adds nucleotides. PacBio produces long read data (tens of kilobases) that facilitate genome assembly.
    RNA-Seq: RNA-Seq is a technique that determines the complete or partial RNA sequence using NGS. The RNA expression profiles vary in different tissues of the same organism and can be influenced by physiopathological circumstances. RNA-Seq data facilitate genome assembly by providing empirical evidence for annotation of transcribed regions.
    Reference genome: An accepted standard representation of an organism’s DNA sequence. High-quality reference genomes typically have high completeness (chromosome-level with few gaps in sequence), few errors, and are annotated and accessible. A reference genome serves as a tool for alignment-based analyses, such as variant calling or RNAseq, and has many other applications, for example, phylogenetics and evolutionary relationships, identification of genes and variants, functional analysis and comparative genomics. Reference genomes referred to as “drafts” are those that are under active construction and refinement, and not yet finalised through manual curation.
    SAC: Sequencing and Assembly Committee
    SOP: A standard operating procedure (SOP) is a document that provides detailed instructions on how to perform an activity, outlining the step-by-step process required for its execution.
    SRA: Sequence Read Archive
    SSP: Sampling & Sample Processing (Committee)
    TKT: Training & Knowledge Transfer Committee
    Transcriptome: A transcriptome is a set of aligned RNAseq reads representing RNA collected from a sample or collection of samples. This includes both protein-coding and non-coding transcripts.
    Voucher: A voucher specimen is a permanently preserved object (either whole or in part, and/or physical or digital) of an identified organism (verified by a recognised expert) and which is deposited in an accessible facility or database. A voucher provides physical evidence about any specimen’s taxonomic identity. Voucher deposition is a best practice for conducting biodiversity genomics research.

    References
    The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics (Glossary). bioRxiv 2023.09.25.559365; doi: https://doi.org/10.1101/2023.09.25.559365
    How genomics can help biodiversity conservation; doi: https://doi.org/10.1016/j.tig.2023.01.005
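    Three of the glossary terms above (K-mer, N50, and the Q40 error threshold) are easier to grasp with a small worked example. The Python sketch below is illustrative only and is not part of any ERGA pipeline: the function names kmers, n50 and phred_to_error_rate are hypothetical, and the sketch assumes the standard definitions, i.e. N50 as the length at which the longest sequences together cover at least half of the total assembly length, and a Phred-style quality Q corresponding to an error probability of 10^(-Q/10).

```python
def kmers(sequence, k):
    """Return every overlapping substring of length k, in order of position."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def n50(lengths):
    """Return the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must not be empty")


def phred_to_error_rate(q):
    """Convert a Phred-style quality value to an error probability."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    # The glossary example: AGCT contains the 3-mers AGC and GCT.
    print(kmers("AGCT", 3))             # ['AGC', 'GCT']
    # Toy contig lengths: 6 + 5 = 11 covers at least half of 20, so N50 is 5.
    print(n50([2, 3, 4, 5, 6]))         # 5
    # Q40 corresponds to roughly one error in 10,000 bases.
    print(phred_to_error_rate(40))
```

    Real assemblies are evaluated with dedicated tools rather than hand-written functions; the sketch only shows what the numbers in the EBP quality standard entry mean.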

  • SAC - Sequencing and Assembly Committee

    SAC - Sequencing and Assembly Committee
    assembly@erga-biodiversity.eu

    The Sequencing and Assembly Committee (SAC) aims to foster collaboration within the ERGA community by organising and coordinating efforts and by providing a platform for exchanging ideas on topics regarding genome assembly methods. SAC actively engages with other ERGA committees, genome projects, and consortia to maintain updated workflows and develop standardised pipelines aligned with ERGA's mission and quality requirements. Furthermore, the committee is establishing a framework for assembly evaluation to ensure high-quality standards are met and addresses complex cases by seeking feedback from the community. ERGA-SAC is also committed to producing guidelines, materials, SOPs, and best practices to facilitate ongoing improvement and knowledge sharing. (V.1.0 02.05.2023)

    Chair
    Camila Mazzoni
    Coordinator
    Diego de Panis
    Steering Committee
    Tyler Alioto, Nadège Guiglielmoni, Catherine Breton, Kerstin Howe

    Looking for assistance and guidance with how to assemble a genome? The Sequencing and Assembly Committee can help!
    Join our Slack Channel! Here you can post your questions and start conversations with the Sequencing and Assembly community from the ERGA consortium.
    Use our resources! Here we have a collection of Genome Assembly Workshops collected and curated by the members of the ERGA SAC.
    Join our mailing list! Send an email to assembly@erga-biodiversity.eu to join the ERGA Sequencing and Assembly mailing list and get regular updates about the activities of the SAC.
    Present at our meetings! Send an email to assembly@erga-biodiversity.eu to request a slot to present at a SAC meeting if you would like feedback on your project. We can advise on steps to improve an assembly or potential pipelines that you may find useful.
    Resources
    💡 ERGA Knowledge Hub
    ▶️ ERGA SAC Youtube Playlist
    🔗 Galaxy workflow for de-novo genome assembly using PacBio HiFi and HiC data
    🔗 Galaxy workflow for de-novo genome assembly using ONT, Illumina WGS and HiC data
