One of the standard file formats for chip genotype data is currently the PLINK text (ped/map) or PLINK binary (bed/bim/fam) format. Data in these formats are normally converted to VCF for imputation and further analysis.

An issue with this conversion is that PLINK currently sets allele 1 as the minor allele and allele 2 as the major allele. Whilst this is somewhat close to the reference allele definition (the reference is usually the common allele), the two do not always coincide.
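
The size of this problem for a given data set can be estimated by comparing the allele 2 column of a .bim file with a table of known reference alleles. The following is a minimal sketch only: the function name, the variant IDs and the dictionary of reference alleles are hypothetical and made up for illustration, and the .bim rows are assumed to use the usual six whitespace-separated columns (chromosome, variant ID, cM, position, allele 1, allele 2).

```python
def find_ref_mismatches(bim_lines, ref_alleles):
    """Return IDs of variants whose allele 2 (major allele) is not the reference.

    bim_lines   : iterable of .bim rows (chrom, ID, cM, position, A1, A2)
    ref_alleles : dict of variant ID -> reference allele (hypothetical input)
    """
    mismatches = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed rows
        variant_id, a2 = fields[1], fields[5]
        ref = ref_alleles.get(variant_id)
        # Only report variants that have a known reference allele
        if ref is not None and ref.upper() != a2.upper():
            mismatches.append(variant_id)
    return mismatches


# Made-up records, for illustration only
bim = [
    "1 rs0001 0 1000 A G",  # major allele G is also the reference
    "1 rs0002 0 2000 C T",  # major allele is T, but the reference is C
]
refs = {"rs0001": "G", "rs0002": "C"}
print(find_ref_mismatches(bim, refs))
```

Any variant reported here would end up with the wrong REF allele if the minor/major ordering were carried straight through to VCF.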

Therefore these files are designed to be used in PLINK with the --reference-allele (or --a1-allele, --a2-allele) option, allowing allele 1 or 2 to be set as the reference so that it is correctly assigned in any resulting VCF conversion.
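
As a rough illustration, a conversion command could be assembled as in the sketch below. It makes several assumptions that should be checked against the PLINK documentation for the version in use: that PLINK 1.9 is installed as `plink`, that VCF export takes allele 2 as the reference (hence --a2-allele), and that the allele file has the variant ID in column 1 and the allele in column 2. The file names and the helper function are hypothetical.

```python
import shlex


def build_vcf_command(bfile, ref_file, out_prefix, plink="plink"):
    """Build an argument list that sets allele 2 from ref_file and writes a VCF."""
    return [
        plink,
        "--bfile", bfile,          # prefix of the bed/bim/fam fileset
        "--a2-allele", ref_file,   # assumed layout: variant ID in column 1, allele in column 2
        "--recode", "vcf",         # request VCF output
        "--out", out_prefix,
    ]


# Hypothetical file names, for illustration only
cmd = build_vcf_command("mydata", "chip.refalleles.txt", "mydata_ref")
print(" ".join(shlex.quote(part) for part in cmd))
# To run the command: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems if any of the file names contain spaces.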

NOTE: at present the files are not 100% correct for indels. Whilst the allele assignment is correct, the allele listed may be truncated, which may cause issues with PLINK.