Bibliographic details
Authors: Bonar, Maegwin, Elliott, Tyler A., Ahmadi, Mirza A M, Cottenie, Karl, Linquist, Stefan
Format: Digital resource
Language: English
Published: Zenodo 2026
Online access: https://doi.org/10.5281/zenodo.20056424
Contents:
This is the repeat annotation data generated in "Biological implications of a detailed repeat annotation in Octopus vulgaris" (https://doi.org/10.64898/2026.03.03.709284) for the Octopus vulgaris ASM119413v2 assembly (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_001194135.2/). This includes:

- The GFF file of the repeat annotation (OctVulg_genome_annotation_only.filteredRepeats.gff). This is the raw annotation, before TE sequences smaller than 100 bp were filtered out for calculating summary information.
- The FASTA file containing the reference TE sequences used as well as the new, curated TE and other repeat consensus sequences that were generated (O_vulgaris_and_reference_repeats_Oct24-2.fasta). Sequence headers start with 'REFERENCE' for reference sequences and with 'Ovulg' for new consensus sequences. The file includes several consensus sequences of zinc-finger gene arrays that were noticed during curation (#ZF-array), as well as satellites, unknown repeats and some RNA loci. The characters before the '#' symbol form a string that is unique to each sequence.

The R markdown file (GBE_O_vulgaris_repeat_annotation.Rmd) contains the code for filtering the GFF file for short TE hits and for young elements, and for recreating Figures 1 and 2, including the hotspot/coldspot analysis. The inputs required for this pipeline are the GFF file of the repeat annotation (OctVulg_genome_annotation_only.filteredRepeats.gff) and the Octopus vulgaris ASM119413v2 assembly (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_001194135.2/).
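The length filter and the header convention described above can be sketched in a few lines of Python. This is an illustration only, not the code in GBE_O_vulgaris_repeat_annotation.Rmd: it assumes standard GFF columns (start in column 4, end in column 5, 1-based and inclusive) and reads "smaller than 100 bp" as a feature length below 100. The function names, the example records and the example header are all hypothetical.

```python
MIN_LEN = 100  # threshold named in the description; its exact use in the Rmd is assumed


def feature_length(gff_line):
    """Length in bp of one GFF feature (columns 4 and 5 are 1-based, inclusive)."""
    cols = gff_line.rstrip("\n").split("\t")
    return int(cols[4]) - int(cols[3]) + 1


def filter_short_hits(lines, min_len=MIN_LEN):
    """Keep comment/pragma lines and features that are at least min_len bp long."""
    kept = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#") or feature_length(line) >= min_len:
            kept.append(line)
    return kept


def split_header(header):
    """Split a FASTA header into (unique_id, label, is_reference).

    Text before '#' is the unique string; text after it is the label
    (for example 'ZF-array'). Reference sequences start with 'REFERENCE'.
    """
    name = header.lstrip(">").strip()
    unique_id, _, label = name.partition("#")
    return unique_id, label, unique_id.startswith("REFERENCE")


# Made-up records for illustration; they are not taken from the dataset.
example = [
    "##gff-version 3",
    "chr1\tsrc\tdispersed_repeat\t1\t99\t.\t+\t.\tID=a",     # 99 bp, dropped
    "chr1\tsrc\tdispersed_repeat\t200\t299\t.\t-\t.\tID=b",  # 100 bp, kept
]
print(filter_short_hits(example))
print(split_header(">Ovulg_001#ZF-array"))
```

To run this against the real files, read OctVulg_genome_annotation_only.filteredRepeats.gff line by line and pass the lines to `filter_short_hits`. Whether the published pipeline uses a strict or inclusive cutoff should be checked in the R markdown file itself.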