Comprehensive Supplementary Dataset: Viral Evolution Analyses Including Recombination, Molecular Clock, and Selection Pressure

Dataset

Description

This repository contains all supplementary materials that support the full spectrum of viral evolutionary analyses in the study, including recombination detection, temporal signal assessment, substitution rate estimation, and positive selection detection across diverse viral taxa. The dataset is divided into five subdirectories:

Viruses – GenBank-derived viral FASTA sequences parsed through custom scripts, serving as primary data for downstream analyses.

Scripts – A set of 23 Python scripts used for sequence parsing, filtering, and analysis; details on usage are provided in the thesis (Chapter 2, Table 2, Section 2.1.2).

Recombination – Spreadsheet summarising recombination signals per alignment, filtered by temporal signal and taxonomy.

Molecular Clock – MEGA-aligned files and Newick trees used in TempEst analyses, along with temporal signal statistics.

Positive Selection – Input and output files from SLR runs, including PHYLIP alignments, tree files, and control files for each virus, accompanied by spreadsheets summarising sites under positive selection and their InterProScan annotations.

This dataset provides the full methodological foundation to replicate or extend the evolutionary analyses presented in the project, which aims to understand how tempo and mode of evolution vary across viral taxonomic structures and whether these patterns can serve as taxonomic markers.
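
For readers who want to sanity-check the temporal signal statistics in the Molecular Clock directory, the sketch below illustrates the general idea behind a root-to-tip regression of the kind TempEst reports: genetic distance from the root is regressed on sampling date, the slope approximates the substitution rate, and the x-intercept approximates the date of the root. It is not one of the 23 scripts in this dataset and does not reproduce TempEst itself; the function name, the dictionary keys, and the example values are all hypothetical, and the dates and distances are invented for illustration only.

```python
from statistics import mean


def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.

    dates     -- sampling dates as decimal years (hypothetical input format)
    distances -- root-to-tip distances in substitutions per site
    Returns the slope (rate), the x-intercept (root date), and R-squared.
    """
    if len(dates) != len(distances) or len(dates) < 3:
        raise ValueError("need at least three paired observations")

    x_bar, y_bar = mean(dates), mean(distances)
    sxx = sum((x - x_bar) ** 2 for x in dates)
    syy = sum((y - y_bar) ** 2 for y in distances)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")

    slope = sxy / sxx                      # substitutions/site/year
    intercept = y_bar - slope * x_bar      # distance at year 0
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    # The x-intercept is the date at which the fitted distance is zero.
    root_date = -intercept / slope if slope != 0 else None

    return {"rate": slope, "root_date": root_date, "r_squared": r_squared}


# Invented example values, not taken from the dataset.
example = root_to_tip_regression(
    [2000.0, 2005.0, 2010.0, 2015.0],
    [0.010, 0.015, 0.020, 0.025],
)
print(example)
```

A positive slope with a high R-squared is usually read as evidence of temporal signal; a flat or negative slope suggests the alignment is unsuitable for molecular clock analysis without further checks.
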

Date made available: 30/05/2025
Publisher: Lancaster University

Contact person