Bibliographic Details
Main Author: Luthra, Ishika
Format: Digital resource
Language:
Published: Zenodo 2025
Online Access: https://doi.org/10.5281/zenodo.16815570
Table of Contents:
  • <div> <p>This Evaluator assesses the consistency between point predictions for forward and reverse-complement sequences by calculating the Pearson correlation coefficient (r) between them. It checks that Predictors generating accessibility scores produce similar results for both strands. This Consistency Evaluator specifically requests point accessibility predictions for the K562 cell type in <em>Homo sapiens</em>.</p> <p>The <code>Consistency_evaluator_point_K562.sif</code> container contains the following:</p> <ul> <li>The scripts required to process the data and connect to Predictors through the GAME API</li> </ul> <p>The <code>/evaluator_data</code> folder contains:</p> <ul> <li>The <code>all_consistency_data.csv</code> sequence file, which contains 900 sequences (251 bp each):<br><br> <ol> <li>150 randomly sampled peak sequences from iPSC ATAC-seq data and their reverse complements (300 sequences total)</li> <li>Mononucleotide-shuffled versions of the original sequences and their reverse complements (300 sequences total)</li> <li>Dinucleotide-shuffled versions of the original sequences and their reverse complements (300 sequences total)</li> </ol> </li> </ul> <p>The folder also contains the data and script needed to recreate the final <code>.csv</code> file:</p> <ul> <li>The original BED file of iPSC ATAC-seq data, pulled from ENCODE: <code>ENCFF121CAA.bed</code></li> <li>The <code>sequence_design.py</code> script, which works as follows:  <ol> <li>It pulls the center 251 bp from 150 random peaks in the BED file and writes them to a <code>.fasta</code> file</li> <li>The FASTA file is passed to a tool called BiasAway to create the mono- and dinucleotide-shuffled sequences</li> <li>The original and shuffled sequence files are read in and written to one file (<code>all_consistency_data.csv</code>)</li> </ol> </li> </ul> <p>How to run:</p> <div><code>apptainer run --containall -B /path_to/evaluator_data/:/evaluator_data -B /path_to/prediction_folder/:/predictions Consistency_evaluator_point_K562.sif HOST PORT 
/predictions</code></div> <p>Notes:</p> <ul> <li>This Evaluator is designed to test the consistency of any Predictor that can return accessibility predictions for <em>Homo sapiens</em></li> <li>The <code>all_consistency_data.csv</code> sequence file is copied into the container and is not regenerated every time the container is run; the code is included for users who are curious about how the file was created</li> </ul> <p>Additional information regarding the API can be found here: <a href="https://github.com/de-Boer-Lab/Genomic-API-for-Model-Evaluation">https://github.com/de-Boer-Lab/Genomic-API-for-Model-Evaluation</a></p> </div>
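The consistency check described above can be illustrated with a minimal Python sketch. It is not the code shipped in the container: the function names (`reverse_complement`, `center_window`, `pearson_r`) and the example prediction values are hypothetical, and taking the window around the midpoint of the peak interval is an assumption about what "center 251 bp" means.

```python
import math

# Translation table for complementing DNA bases (N is left as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def center_window(start: int, end: int, width: int = 251) -> tuple:
    """Return (start, end) of a `width`-bp window centered on a peak.

    Assumption: the window is centered on the midpoint of the BED
    interval; the original script may use a different definition.
    """
    mid = (start + end) // 2
    new_start = mid - width // 2
    return new_start, new_start + width


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(ss_x * ss_y)


if __name__ == "__main__":
    print(reverse_complement("ACGTT"))  # AACGT

    # Hypothetical point predictions for the same sequences on each strand.
    forward_preds = [0.82, 0.10, 0.55, 0.31]
    reverse_preds = [0.80, 0.14, 0.50, 0.35]
    print(f"strand consistency r = {pearson_r(forward_preds, reverse_preds):.3f}")
```

A value of r close to 1 would indicate that a Predictor scores a sequence and its reverse complement similarly, which is the behavior this Evaluator is testing for.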