
Electronic data

  • CYP2A6SV_text_clean_mar22_AL

    Accepted author manuscript, 287 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License




Genotyping, characterization, and imputation of known and novel CYP2A6 structural variants using SNP array data

Research output: Contribution to Journal/Magazine > Journal article > peer-review

  • Alec W R Langlois
  • Ahmed El-Boraie
  • Jennie G Pouget
  • Lisa Sanderson Cox
  • Jasjit S Ahluwalia
  • Koya Fukunaga
  • Taisei Mushiroda
  • Jo Knight
  • Meghan J Chenoweth
  • Rachel F Tyndale
Journal publication date: 31/08/2023
Journal: Journal of Human Genetics
Number of pages: 9
Pages (from-to): 533-541
Publication Status: Published
Early online date: 14/04/2023
Original language: English


CYP2A6 metabolically inactivates nicotine. Faster CYP2A6 activity is associated with heavier smoking and higher lung cancer risk. The CYP2A6 gene is polymorphic, and its variation includes functional structural variants (SVs) such as gene deletions (CYP2A6*4), duplications (CYP2A6*1×2), and hybrids with the CYP2A7 pseudogene (CYP2A6*12, CYP2A6*34). SVs are challenging to genotype due to their complex genetic architecture. Our aims were to develop a reliable protocol for SV genotyping, functionally phenotype known and novel SVs, and investigate the feasibility of CYP2A6 SV imputation from SNP array data in two ancestry populations. European- (EUR; n = 935) and African- (AFR; n = 964) ancestry individuals from smoking cessation trials were genotyped for SNPs using an Illumina array and for CYP2A6 SVs using TaqMan copy number (CN) assays. SV-specific PCR amplification and Sanger sequencing were used to characterize a novel SV. Individuals with SVs were phenotyped using the nicotine metabolite ratio, a biomarker of CYP2A6 activity. SV diplotype and SNP array data were integrated and phased to generate ancestry-specific SV reference panels. Leave-one-out cross-validation was used to investigate the feasibility of CYP2A6 SV imputation. A minimal protocol requiring three TaqMan CN assays for CYP2A6 SV genotyping was developed, and known SV associations with activity were replicated. The first domain swap CYP2A6-CYP2A7 hybrid SV, CYP2A6*53, was identified, sequenced, and associated with lower CYP2A6 activity. In both EURs and AFRs, most SV alleles were identified using imputation (>70% and >60%, respectively); importantly, false positive rates were <1%. These results confirm that CYP2A6 SV imputation can identify most SV alleles, including a novel SV.
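
As a rough illustration of how the two imputation figures quoted above (the share of SV alleles identified and the false positive rate) could be tallied from leave-one-out results, the Python sketch below compares assay-genotyped alleles with imputed alleles. It is not the authors' pipeline: the function name `sv_imputation_rates`, the use of `*1` as the non-SV reference label, and the exact definitions of both rates are assumptions made for this example, and the allele lists are made-up toy values.

```python
from typing import Sequence, Tuple


def sv_imputation_rates(
    true_alleles: Sequence[str],
    imputed_alleles: Sequence[str],
    reference: str = "*1",  # assumed label for a non-SV allele
) -> Tuple[float, float]:
    """Return (fraction of SV alleles identified, false positive rate).

    Assumed definitions, for illustration only:
      - identified: a true SV allele imputed as that same SV allele
      - false positive: a true reference allele imputed as any SV allele
    """
    if len(true_alleles) != len(imputed_alleles):
        raise ValueError("allele lists must have the same length")

    sv_total = sv_found = ref_total = false_pos = 0
    for truth, call in zip(true_alleles, imputed_alleles):
        if truth != reference:
            sv_total += 1
            if call == truth:
                sv_found += 1
        else:
            ref_total += 1
            if call != reference:
                false_pos += 1

    # Return NaN rather than dividing by zero when a class is absent.
    identified = sv_found / sv_total if sv_total else float("nan")
    fpr = false_pos / ref_total if ref_total else float("nan")
    return identified, fpr


# Toy data only; these are not results from the study.
truth = ["*1", "*4", "*1", "*12", "*1", "*1"]
calls = ["*1", "*4", "*1", "*1", "*4", "*1"]
identified, fpr = sv_imputation_rates(truth, calls)
print(f"SV alleles identified: {identified:.2f}, false positive rate: {fpr:.2f}")
```

In a leave-one-out setting, each held-out individual's alleles would be imputed from a reference panel built without that individual, and the resulting calls pooled before computing the two rates.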