| Authors: | , , , , |
|---|---|
| Format: | Digital resource |
| Language: | English |
| Published: | Zenodo, 2026 |
| Subjects: | |
| Online access: | https://doi.org/10.5281/zenodo.20056424 |
Contents:
- This is the repeat annotation data generated in "Biological implications of a detailed repeat annotation in Octopus vulgaris" (https://doi.org/10.64898/2026.03.03.709284) for the Octopus vulgaris ASM119413v2 assembly (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_001194135.2/). It includes:
  - The GFF file of the repeat annotation (OctVulg_genome_annotation_only.filteredRepeats.gff).
    - This is the raw annotation, before TE sequences shorter than 100 bp were filtered out to calculate summary information.
  - The FASTA file containing the reference TE sequences used, together with the new, curated TE and other repeat consensus sequences that were generated (O_vulgaris_and_reference_repeats_Oct24-2.fasta).
    - Sequence headers start with 'REFERENCE' for reference sequences and with 'Ovulg' for new consensus sequences. The file includes several consensus sequences of zinc-finger gene arrays that were noticed during curation (#ZF-array), as well as satellites, unknown repeats and some RNA loci. The characters before the '#' symbol form a string that is unique to each sequence.

  The R Markdown file (GBE_O_vulgaris_repeat_annotation.Rmd) contains the code for filtering the GFF file for short TE hits and for young elements, and for recreating Figures 1 and 2, including the hotspot/coldspot analysis. The pipeline requires two inputs: the GFF file of the repeat annotation (OctVulg_genome_annotation_only.filteredRepeats.gff) and the Octopus vulgaris ASM119413v2 assembly (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_001194135.2/).
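The length filter and the header convention described above can be illustrated with a minimal Python sketch. This is not the authors' code (their pipeline is the R Markdown file); it assumes a standard tab-separated GFF layout with 1-based inclusive start and end in columns 4 and 5, and it applies the 100 bp cutoff to every feature rather than to TE hits only. The function names `filter_short_hits` and `parse_header`, and any sequence names used with them, are hypothetical.

```python
MIN_LEN = 100  # cutoff named in the description: hits shorter than 100 bp are dropped


def filter_short_hits(lines, min_len=MIN_LEN):
    """Yield GFF lines whose feature spans at least min_len bp.

    Comment lines and blank lines are passed through unchanged.
    Assumption: start/end are in columns 4 and 5 (1-based, inclusive).
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        start, end = int(cols[3]), int(cols[4])
        if end - start + 1 >= min_len:
            yield line


def parse_header(header):
    """Split a FASTA header into (source, unique_id, classification).

    'REFERENCE...' marks a reference sequence and 'Ovulg...' a new
    consensus; the text before '#' is the unique identifier and the
    text after it is the repeat class (for example 'ZF-array').
    """
    name = header.lstrip(">").split()[0]
    unique_id, _, classification = name.partition("#")
    if unique_id.startswith("REFERENCE"):
        source = "reference"
    elif unique_id.startswith("Ovulg"):
        source = "new"
    else:
        source = "unknown"
    return source, unique_id, classification
```

To filter the deposited file, `filter_short_hits` could be given an open handle on OctVulg_genome_annotation_only.filteredRepeats.gff and its output written line by line to a new file. Whether the cutoff is inclusive of exactly 100 bp in the authors' pipeline is an assumption here.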