MobiVision Epigenomics - ChIP-seq Results Interpretation

Output Files

The default output files of MobiVision ChIP-seq analysis include a total of 25 files. Note that the SAMPLEID_out directory is automatically generated by the software and does not require user specification:

  1. _flagdone: Flag file indicating successful completion of the analysis.
  2. logs: Directory containing run logs.
  3. run_analysis_cmds.txt: Complete record of command-line instructions executed during the analysis.
  4. SAMPLEID_out: Root directory for all output files.
  5. filtered_cell_fragments_matrix: Root directory for the filtered cell-by-fragment count matrix.
  6. filtered_cell_peaks_matrix: Root directory for the filtered cell-by-peak count matrix.
  7. SAMPLEID.bam: Alignment output file (BAM format).
  8. SAMPLEID.bw: Visualization track file (BigWig format) for alignment results.
  9. SAMPLEID.filtered.bed.gz: Deduplicated and filtered fragment file (compressed BED format).
  10. SAMPLEID_Report.html: Quality control report in HTML format.
  11. SAMPLEID_Report.json: Quality control report in JSON format.
  12. summary.csv: Library statistics summary in CSV format.
  13. raw_cell_peaks_matrix: Root directory for the unfiltered cell-by-peak count matrix.
  14. fragmentsInCells.tsv.gz: Filtered fragment file after cell selection (compressed TSV format).
  15. fragmentsInCells.tsv.gz.tbi: Index file for fragmentsInCells.tsv.gz.
  16. SAMPLEID.narrowPeak/broadPeak: Peak calls for the entire library (in narrowPeak or broadPeak format).
  17. peaks_annotation.tsv.gz: Annotation information for detected peaks (compressed TSV format).
  18. statistics.all.csv: Comprehensive quality control metrics in CSV format.
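
A quick way to confirm that a run finished and produced the files listed above is to look for them on disk. The following is a minimal sketch, not part of MobiVision: the function name check_outputs is hypothetical, and because the exact nesting of files inside the run directory is not specified here, the search is recursive.

```python
from pathlib import Path

# Names taken from the output list above. Where each one sits inside the
# run directory is an assumption, so the search below is recursive.
EXPECTED = [
    "_flagdone",
    "logs",
    "run_analysis_cmds.txt",
    "summary.csv",
    "statistics.all.csv",
    "fragmentsInCells.tsv.gz",
    "fragmentsInCells.tsv.gz.tbi",
    "peaks_annotation.tsv.gz",
]


def check_outputs(run_dir, expected=EXPECTED):
    """Return (found, missing) lists of expected output names under run_dir."""
    root = Path(run_dir)
    # Collect the names of every file and directory below the run directory.
    present = {p.name for p in root.rglob("*")}
    found = [name for name in expected if name in present]
    missing = [name for name in expected if name not in present]
    return found, missing
```

If "_flagdone" appears in the missing list, the analysis most likely did not complete, and the logs directory is the place to look next.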

image.png

Quality Control Report Interpretation

Upon completion of the MobiVision ChIP-seq analysis, an HTML quality control report is generated. The report consists of the following six sections:

01 Overview

image.png

The Sample section includes the following information:

Sample ID: Sample name

Reference: Reference genome name

Library: Library name + Antibody name

Pipeline Version: Analysis pipeline version

02 Cells

image.png

The "Cells" Section

The left panel displays the Barcode Rank Plot, while the right panel shows cell-related metrics, which are consistent with the content in the "Overview" section.

This plot is generated by:

1. Counting the number of valid fragments corresponding to each cell barcode.

2. Sorting cell barcodes in descending order based on valid fragment counts (e.g., the cell barcode with the highest fragment count is assigned rank 1, and so on).

3. Using the cell barcode rank as the x-axis and the number of valid fragments per cell barcode as the y-axis.

Users can click the question mark icon (?) in the upper right corner of the section (also available in other sections) to access detailed help information, as shown below:

image.png
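
The three steps used to build the Barcode Rank Plot can be sketched in a few lines of Python. This is an illustration only, not the MobiVision implementation: the function name barcode_rank and the toy barcodes are hypothetical, and breaking ties alphabetically is an assumption made to keep the output deterministic.

```python
from collections import Counter


def barcode_rank(fragment_barcodes):
    """Return (rank, barcode, fragment_count) tuples; rank 1 has the most fragments."""
    # Step 1: count valid fragments per cell barcode.
    counts = Counter(fragment_barcodes)
    # Step 2: sort barcodes by fragment count in descending order
    # (ties broken alphabetically, which is an assumption of this sketch).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Step 3: the rank is the x value and the fragment count is the y value.
    return [(rank, bc, n) for rank, (bc, n) in enumerate(ordered, start=1)]


# Toy input: one barcode entry per valid fragment (hypothetical barcodes).
fragments = ["AAAC", "AAAC", "AAAC", "GGTT", "GGTT", "CCAT"]
for rank, barcode, n in barcode_rank(fragments):
    print(rank, barcode, n)
```

Each returned tuple gives one point of the plot: the first element is the x coordinate and the last is the y coordinate.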

03 Mapping

image.png

The Mapping Section

The left panel displays the Fragment Length Distribution Plot. The x-axis represents the length of fragments in the library, and the y-axis represents the number of fragments. This plot visually summarizes the distribution of DNA insert fragments of varying lengths.

The right panel shows library mapping statistics, including alignment-related metrics and quality control information.
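
The data behind a fragment length distribution can be sketched as a simple histogram. The example below assumes each fragment is available as a (chromosome, start, end) record with BED-style half-open coordinates, so that length equals end minus start; the actual column layout of the MobiVision fragment files is not described here, and the function name is hypothetical.

```python
from collections import Counter


def fragment_length_histogram(fragments):
    """Map fragment length (bp) to the number of fragments with that length."""
    # BED-style intervals are half-open, so the length is simply end - start.
    return Counter(end - start for _chrom, start, end in fragments)


# Toy fragments (hypothetical coordinates).
toy = [("chr1", 100, 250), ("chr1", 400, 550), ("chr2", 10, 330)]
hist = fragment_length_histogram(toy)
for length in sorted(hist):
    print(length, hist[length])
```

The sorted lengths correspond to the x-axis of the plot and the counts to the y-axis.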

04 Targeting

image.png

Targeting Section

Left Panel: Peaks Targeting Plot

● Each point represents a unique cell barcode

● X-axis: Number of fragments per cell barcode

● Y-axis: Percentage of fragments overlapping with peaks in each cell

● Color coding: Distinct colors differentiate between 'Cells' and 'Non-cells'

Right Panel: Peak Information

● Displays detailed information about peaks detected in the library

● Includes peak statistics and genomic annotations

Interactive Features

● Click the question mark icon (?) in the upper right corner for detailed help information

● Hover over data points to view specific numerical values
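
The coordinates of each point in the Peaks Targeting Plot can be illustrated with a small sketch. It assumes fragments are (chromosome, start, end, barcode) records and peaks are (chromosome, start, end) records with half-open coordinates; both layouts and both function names are hypothetical. The overlap check is a naive scan over all peaks, chosen for clarity rather than speed.

```python
from collections import defaultdict


def overlaps_any(chrom, start, end, peaks):
    """True if the half-open interval [start, end) overlaps any peak on chrom."""
    return any(
        p_chrom == chrom and start < p_end and p_start < end
        for p_chrom, p_start, p_end in peaks
    )


def targeting_points(fragments, peaks):
    """Return {barcode: (fragment_count, percent_of_fragments_in_peaks)}."""
    total = defaultdict(int)
    in_peaks = defaultdict(int)
    for chrom, start, end, barcode in fragments:
        total[barcode] += 1
        if overlaps_any(chrom, start, end, peaks):
            in_peaks[barcode] += 1
    # x value: fragments per barcode; y value: percentage overlapping peaks.
    return {bc: (n, 100.0 * in_peaks[bc] / n) for bc, n in total.items()}


# Toy data (hypothetical coordinates and barcodes).
peaks = [("chr1", 100, 200)]
fragments = [
    ("chr1", 150, 300, "A"),
    ("chr1", 500, 600, "A"),
    ("chr1", 90, 110, "B"),
    ("chr1", 120, 180, "B"),
    ("chr2", 150, 160, "B"),
]
points = targeting_points(fragments, peaks)
```

For large libraries an interval index or a sorted sweep would replace the naive scan; the per-barcode arithmetic stays the same.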

05 Sequencing

image.png

Sequencing Section

Left Panel: Library Complexity Plot

● The plot shows the average number of fragments detected per cell at different sequencing depths (obtained through random downsampling)

● X-axis: Sequencing depth

● Y-axis: Average number of fragments per cell

● The curve plateaus when all fragments in the library have been captured, indicating that additional sequencing depth will not yield more fragments per cell

Right Panel: Library Information Results

● Presents key sequencing metrics and library preparation statistics

● Includes data such as total sequencing depth, library concentration, and other relevant quality indicators

Technical Note
The random downsampling approach allows for evaluating the relationship between sequencing effort and fragment detection efficiency, helping to determine optimal sequencing depth for future experiments.
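
The downsampling idea can be sketched as follows. This is an illustration under stated assumptions, not the MobiVision procedure: reads are modeled as (cell barcode, fragment id) pairs, duplicate reads of the same fragment are collapsed at each depth, and the number of cells is fixed from the full data set. The function name complexity_curve and the toy data are hypothetical.

```python
import random


def complexity_curve(reads, fractions, seed=0):
    """Return [(n_reads_sampled, mean_unique_fragments_per_cell), ...]."""
    rng = random.Random(seed)
    # Cells are fixed from the full data so every depth uses the same denominator.
    n_cells = len({barcode for barcode, _frag in reads})
    curve = []
    for frac in fractions:
        k = round(frac * len(reads))
        # Draw k reads at random without replacement.
        sampled = rng.sample(reads, k)
        # Duplicate reads of the same fragment collapse in the set.
        unique = len(set(sampled))
        curve.append((k, unique / n_cells))
    return curve


# Toy reads: 2 cells, 50 fragments each, every fragment sequenced 4 times.
reads = [(bc, f) for bc in ("A", "B") for f in range(50) for _ in range(4)]
curve = complexity_curve(reads, [0.1, 0.25, 0.5, 1.0])
for depth, mean_frags in curve:
    print(depth, round(mean_frags, 1))
```

In this toy library the curve approaches 50 fragments per cell as depth increases, which is the plateau behavior described above: once every fragment has been seen, further reads only add duplicates.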

06 t-SNE Projection

image.png

The t-SNE Projection Section

The left panel displays the t-SNE Plot of Valid Fragments, which visualizes the number of valid fragments per cell. The X and Y axes represent the two-dimensional embedding coordinates generated by the t-SNE algorithm. Each point corresponds to a cell; a higher number of valid fragments in a cell indicates stronger ChIP signal detection.

The right panel shows the Clustered t-SNE Plot, which colors cells according to the clusters assigned by the clustering algorithm. Each point represents a cell, and points with similar colors indicate that the corresponding cells exhibit highly similar ChIP signal profiles.