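| Main Authors: | Coss-Navarrete, Evelia Lorena; Sofia Salazar-Magaña; Diego Ramirez-Espinosa; Hernández-Ledesma, Ana Laura; Lizbet Tinajero-Nieto; Torres-Valdez, Estefanía; Peña Ayala, Angelica; Guillermo Félix-Rodríguez; Frontana, Gabriel; Jair Garcia-Sotelo; Trynka, Gosia; Rosetti, Florencia; Fernandez-Valverde, Selene L; Gutierrez-Arcelus, Maria; Alpízar-Rodríguez, Deshire; Medina Rivera, Alejandra |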
|---|---|
| Format: | Digital resource |
| Language: | |
| Published: | Zenodo, 2025 |
| Online Access: | https://doi.org/10.5281/zenodo.17419882 |
| _version_ | 1866902159850733568 |
|---|---|
| author | Coss-Navarrete, Evelia Lorena; Sofia Salazar-Magaña; Diego Ramirez-Espinosa; Hernández-Ledesma, Ana Laura; Lizbet Tinajero-Nieto; Torres-Valdez, Estefanía; Peña Ayala, Angelica; Guillermo Félix-Rodríguez; Frontana, Gabriel; Jair Garcia-Sotelo; Trynka, Gosia; Rosetti, Florencia; Fernandez-Valverde, Selene L; Gutierrez-Arcelus, Maria; Alpízar-Rodríguez, Deshire; Medina Rivera, Alejandra |
| contents | <p><strong>Aim</strong>: Find differentially activated regulons in SLE vs. controls in bulk RNA-seq data</p> <p>This code was adapted from the pipeline described in the article: <br>“<a href="https://www.frontiersin.org/journals/rna-research/articles/10.3389/frnar.2024.1334873/full">Identification of regulons modulating the transcriptional response to SARS-CoV-2 infection in humans</a>” by Padilla-Gálvez et al., 2024. <br>Script: <a href="https://github.com/amedina-liigh/PulmonDB_COVID/blob/main/COVID/bulk/reports/03-pyscenic/pyscenic_multiruns_batchcorr-sep.rmd">pyscenic_multiruns_batchcorr-sep.rmd</a></p> <p><strong>Source files:</strong></p> <ul> <li><em>toRun/counts/vsd_all.RData</em>: R object containing variance-stabilized expression data, typically generated from DESeq2. Used for downstream visualization and differential expression analysis.</li> <li><em>toRun/metadata/named-transcripts-info_v2.RData</em>: Annotated transcript metadata with columns: <ul> <li>ID: Transcript identifier (Ensembl IDs)</li> <li>gene_name: Associated gene symbol</li> <li>transcript_biotype: Functional category (e.g. protein-coding, lncRNA, pseudogene)</li> </ul> </li> <li><em>toRun/counts/reduced_counts.csv</em>: Filtered expression matrix containing variance-stabilized counts for unique genes across all samples.</li> </ul> <p><strong>1. Run SCENIC pipeline</strong></p> <p>Input files:</p> <ul> <li>Config file: <em>toRun/sle_scenic.config</em></li> <li>Input data: <em>toRun/counts/reduced_counts.loom</em></li> <li>Resources folder: <em>toRun/resources</em></li> </ul> <p>Output files:</p> <ul> <li><em>resultsRun/output/multi_runs_looms/multi_runs_regulons_auc_trk.loom</em>: SCENIC output loom <ul> <li>RegulonsAUC: AUC scores for regulon activity across cells or samples</li> <li>CellID: Unique identifiers for each sample</li> <li>Regulons: Binary matrix of transcription factor–target relationships</li> <li>Gene: Gene names associated with each regulon</li> </ul> </li> <li><em>resultsRun/output/multi_runs_aucell/multi_runs_regulons_auc_trk.tsv</em>: Tabulated AUC scores <ul> <li>RegulonsAUC: AUC scores for regulon activity across cells or samples</li> <li>CellID: Unique identifiers for each sample</li> <li>RegulonName: Transcription factor names associated with each regulon</li> </ul> </li> <li><em>resultsRun/output/multi_runs_cistarget/multi_runs_features_trk.csv.gz</em>: Regulatory feature scores <ul> <li>Gene-motif associations: Motif enrichment scores for candidate target genes</li> <li>Rankings: CisTarget ranking of regulatory evidence per gene</li> <li>Motif: Motif ID and annotation used to infer regulation</li> <li>Format: Compressed CSV for downstream analysis or validation</li> </ul> </li> <li><em>resultsRun/output/multi_runs_cistarget/multi_runs_regulons_trk.pkl.gz</em>: Regulon object (Python pickle) <ul> <li>Regulons: Dictionary of transcription factors and their predicted target genes</li> <li>Metadata: Includes motif support and confidence scores</li> <li>Format: Compressed Python object for reuse in SCENIC or custom workflows</li> </ul> </li> <li><em>resultsRun/output/multi_runs_regulons_trk</em>: Folder with regulon-level outputs <ul> <li>GeneLists: Individual files listing target genes per regulon</li> <li>MotifEnrichment: Motif-level evidence supporting TF–target relationships</li> <li>Intermediate files: Support reproducibility and allow inspection of regulon construction steps</li> </ul> </li> </ul> <p><strong>2. Get differentially active regulons</strong></p> <p>Script:</p> <ul> <li><em>resultsRun/scripts/dif_regulons.py</em>: This script identifies differentially active transcriptional regulons between two groups of samples (typically SLE vs. control) using SCENIC output.</li> </ul> <p>Input files:</p> <ul> <li><em>resultsRun/output/multi_runs_looms/multi_runs_regulons_auc_trk.loom</em>: SCENIC output loom</li> <li><em>resultsRun/metadata/metadata.csv</em>: Sample-level metadata file. Must include: <ul> <li>sample_ID: Unique identifier matching CellID in the loom file</li> <li>Group: Experimental condition label (e.g. "SLE" or "Ctrl")</li> </ul> </li> </ul> <p>Output files:</p> <ul> <li><em>resultsRun/results/AUC_mtx.csv</em>: Raw AUC matrix for all cells/samples and regulons</li> <li><em>resultsRun/results/tf_targets.csv</em>: Binary matrix of transcription factor targets per regulon</li> <li><em>resultsRun/results/histogram_SLE_Ctrl_regulons.png</em>: Histogram of adjusted p-values for differential regulons</li> <li><em>resultsRun/results/difregs_SLE_Ctrl.csv</em>: Final merged results: regulon name, p-values, adjusted p-values, log2FC</li> </ul> |
| format | Digital resource |
| id | zenodo_https___doi_org_10_5281_zenodo_17419882 |
| institution | Zenodo |
| language | |
| publishDate | 2025 |
| publisher | Zenodo |
| record_format | zenodo |
| title | Regulatory Network Analysis of SLE-Associated Regulon Activity using SCENIC pipeline |
| url | https://doi.org/10.5281/zenodo.17419882 |
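
The record describes step 2 (comparing regulon AUC scores between SLE and control samples and reporting p-values, adjusted p-values, and log2FC) without stating which statistical test dif_regulons.py applies. The sketch below is therefore only an illustration under stated assumptions: a two-sided Mann-Whitney U test (normal approximation with tie correction) per regulon, followed by Benjamini-Hochberg adjustment. The function names, the small `eps` pseudocount, and the toy AUC values and regulon names are hypothetical; the sample-to-group mapping mirrors the sample_ID and Group columns of metadata.csv. It uses only the Python standard library so it runs as-is.

```python
import math
from statistics import mean


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t**3 - t
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    n = n1 + n2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:  # all values identical: no evidence of a difference
        return 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differential_regulons(auc, groups, case="SLE", control="Ctrl", eps=1e-9):
    """auc: {regulon: {sample_ID: AUC}}; groups: {sample_ID: Group label}."""
    rows = []
    for regulon, scores in auc.items():
        case_vals = [v for s, v in scores.items() if groups.get(s) == case]
        ctrl_vals = [v for s, v in scores.items() if groups.get(s) == control]
        if not case_vals or not ctrl_vals:
            raise ValueError(f"{regulon}: both groups need at least one sample")
        rows.append({
            "regulon": regulon,
            "p_value": mann_whitney_p(case_vals, ctrl_vals),
            # eps is an assumed pseudocount guarding against a zero mean AUC
            "log2FC": math.log2((mean(case_vals) + eps) / (mean(ctrl_vals) + eps)),
        })
    p_adj = benjamini_hochberg([r["p_value"] for r in rows])
    for row, q in zip(rows, p_adj):
        row["p_adj"] = q
    return sorted(rows, key=lambda r: r["p_adj"])


# Toy data (hypothetical sample IDs, regulon names, and AUC values).
groups = {"S1": "SLE", "S2": "SLE", "S3": "SLE", "S4": "SLE",
          "C1": "Ctrl", "C2": "Ctrl", "C3": "Ctrl", "C4": "Ctrl"}
auc = {
    "IRF7(+)": {"S1": 0.30, "S2": 0.32, "S3": 0.35, "S4": 0.31,
                "C1": 0.10, "C2": 0.12, "C3": 0.11, "C4": 0.13},
    "SP1(+)": {"S1": 0.20, "S2": 0.22, "S3": 0.21, "S4": 0.19,
               "C1": 0.21, "C2": 0.20, "C3": 0.19, "C4": 0.22},
}
results = differential_regulons(auc, groups)
for row in results:
    print(row)
```

With real data, the AUC values would come from AUC_mtx.csv and the group labels from metadata.csv. For anything beyond a small illustration, library routines such as `scipy.stats.mannwhitneyu` and `statsmodels.stats.multitest.multipletests` are the more usual choice than hand-written versions.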