Entries by Will Rayner

Genotype chip strand and position update files

To impute and meta-analyse multiple data sets, it is essential that the data are aligned to a common reference, almost always the forward strand of the current human genome build. To do this, and/or to move data between different builds of the human genome, I have created a set of files (here) for the most […]

Files to update Ref/Alt on Illumina chips

Currently, one of the standard file formats for chip genotype data is the ped/map or Plink binary ped (bed/bim/fam) format. This is normally converted to VCF format for imputation and further analysis. An issue with this conversion is that Plink currently sets allele 1 as the minor and allele 2 as the […]

1000G/HRC pre-imputation data set validation

Rigorous data quality control prior to imputation is vital to ensure high quality output. To simplify validating that this QC has been done, we have developed a program to compare a plink .bim file against the HRC or 1000G reference panel SNP list. The program produces an overall summary as well as a set of […]

A/B to TOP strand conversion files

Some programs, such as zCall, produce output with the alleles labelled as A/B. The files available at the link below update the A/B notation to the TOP alleles, thereby allowing the use of the strand and position files (Strand Files) to generate a data set on the forward strand of genome build of […]