GWAS of UK Biobank biomarker measurements

With the UK Biobank’s release of biochemical marker measurements in urine, red blood cell, and serum samples collected from study participants, the Neale lab has updated our collection of publicly available GWAS results to include summary statistics from GWAS conducted on both raw and inverse-rank normalized versions of these biomarker measurements. Consistent with our previous summary statistics, the GWAS was restricted to 361,194 samples of white-British ancestry and 13.7 million QC-passing SNPs.
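For readers unfamiliar with inverse-rank normalization, the following is a minimal sketch of the general technique, not the lab's actual pipeline code. The function name, the rank offset of 0.5, and the average-rank handling of ties are all assumptions made for illustration; the post does not state which variant was used, and missing values are not handled here.

```python
from statistics import NormalDist

def inverse_rank_normalize(values, offset=0.5):
    """Map values to standard-normal quantiles according to their ranks.

    Tied values share their average rank. The offset (0.5 here) is an
    illustrative choice, not one confirmed by the post.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied values.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]

# Made-up measurements with one extreme value.
raw = [3.2, 1.1, 250.0, 4.8, 1.9]
normalized = inverse_rank_normalize(raw)
```

Because only the ranks enter the transform, the extreme value of 250.0 ends up no further from the centre than the smallest value does, which is the usual motivation for running a GWAS on the normalized version alongside the raw one.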

We’ve released GWAS results for the following biomarkers. The files can be found in the UKBB GWAS Imputed v3 - File Manifest Release 20180731 by searching the ‘phenotype’ column for the terms below:

  • Alanine aminotransferase

  • Albumin

  • Alkaline phosphatase

  • Apolipoprotein A

  • Apolipoprotein B

  • Aspartate aminotransferase

  • C-reactive protein

  • Calcium

  • Cholesterol

  • Creatinine

  • Creatinine (enzymatic)

  • Cystatin C

  • Direct bilirubin

  • Gamma glutamyltransferase

  • Glucose

  • Glycated haemoglobin

  • HDL cholesterol

  • Insulin-like growth factor-1 (IGF-1)

  • LDL cholesterol

  • Lipoprotein (a)

  • Microalbumin

  • Oestradiol

  • Phosphate

  • Potassium

  • Rheumatoid factor

  • Sex hormone binding globulin (SHBG)

  • Sodium

  • Testosterone

  • Total bilirubin

  • Total protein

  • Triglycerides

  • Urate

  • Urea

  • Vitamin D

As described in section 6.2 of the UK Biobank’s quality control document accompanying the biomarker measurements, some serum samples were inadvertently diluted during initial sample collection and processing. The UK Biobank attempted to quantify the level of dilution in the released measurements by including an estimated dilution factor for each sample.

To assess the impact of dilution on biomarker GWAS results, we conducted two separate GWAS for each biomarker: one with the set of covariates included in the rest of our GWAS (age, sex, age^2, age*sex, age^2*sex, first 20 PCs), and another with those covariates plus the estimated sample dilution factor as an additional covariate.
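The difference between the two models can be sketched as a single extra column in the covariate set. This is an illustration only: the function name and the example values are hypothetical, sex is assumed to be coded 0/1, and the interaction with the squared term is taken to be age^2 × sex, which is an assumption about the intended model.

```python
def covariate_row(age, sex, pcs, dilution_factor=None):
    """Build the covariate values for one participant.

    sex is assumed to be coded 0/1 and pcs holds the first 20 principal
    components. Passing dilution_factor gives the dilution-adjusted model.
    """
    if len(pcs) != 20:
        raise ValueError("expected 20 principal components")
    # age, sex, age^2, age*sex, age^2*sex, then the 20 PCs
    row = [age, sex, age ** 2, age * sex, age ** 2 * sex, *pcs]
    if dilution_factor is not None:
        row.append(dilution_factor)
    return row

# Made-up participant values, for illustration only.
baseline = covariate_row(57, 1, [0.0] * 20)
adjusted = covariate_row(57, 1, [0.0] * 20, dilution_factor=0.98)
print(len(baseline), len(adjusted))  # 25 26
```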

We compared the inflation of test statistics in the two GWAS versions using both the genomic inflation factor (lambda_GC) and the LD score regression intercept:


The plots above do not show a material reduction in test statistic inflation when the estimated sample dilution factor is included as a covariate. Therefore, to maintain consistency, we decided to share the biomarker GWAS results from the association model that uses the same set of covariates as our other publicly available GWAS summary statistics, rather than the model that adds the estimated sample dilution factor.
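For reference, the genomic inflation factor used in this comparison is conventionally the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). The sketch below shows that standard definition using only the Python standard library. The function name is hypothetical, it assumes two-sided p-values strictly greater than 0 and at most 1, and it is not the code used to produce our results. The LD score regression intercept requires LD scores and is not covered here.

```python
from statistics import NormalDist, median

def genomic_inflation_factor(p_values):
    """Median observed 1-df chi-square statistic over its null median.

    Assumes two-sided p-values with 0 < p <= 1.
    """
    std_normal = NormalDist()
    # A two-sided p-value corresponds to z = inv_cdf(p / 2), and chi2 = z^2.
    chi2 = [std_normal.inv_cdf(p / 2) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution, about 0.455.
    expected_median = std_normal.inv_cdf(0.25) ** 2
    return median(chi2) / expected_median

# Evenly spaced p-values mimic a null GWAS, so the result is close to 1.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lambda_gc = genomic_inflation_factor(null_like)
```

Running this once on the p-values from the baseline model and once on those from the dilution-adjusted model gives the pair of values compared for each biomarker.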

Authored by Liam Abbott, with input from Daniel Howrigan
