Application of SNP reduction approaches and random forest for the identification of population informative markers in cosmopolitan and local cattle breeds
In livestock, single nucleotide polymorphism (SNP) genotyping arrays have been used to differentiate breeds and populations for several downstream applications, including allocation of individuals to breeds, identification of the breeds of origin of crossbred animals, authentication of mono-breed products, and comparative analyses of selection signatures, among other uses. We previously tested a combination of principal component analysis (PCA), used as a preselection method, and random forest (RF), used as a classification method, to assign individuals to cosmopolitan Italian breeds with no or very low error rate. In this work, we increased the number of breeds and approaches to obtain a more comprehensive view of the strategies available and…
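As an illustration of the general idea of PCA-based preselection followed by RF classification, the following is a minimal pure-Python sketch on simulated data. It is not the pipeline used in this work. The two breeds, the allele frequencies, the choice to rank SNPs by the absolute value of their loading on the first principal component, and all function names (`simulate`, `pc1_loadings`, `build_tree`, `fit_forest`, `predict`) are assumptions made for the example.

```python
import random
from collections import Counter

def simulate(n_per_breed, n_snps, informative, rng):
    """Toy 0/1/2 genotypes; informative SNPs differ in allele frequency by breed."""
    X, y = [], []
    for breed, p_inf in (("A", 0.9), ("B", 0.1)):  # assumed frequencies
        for _ in range(n_per_breed):
            row = []
            for j in range(n_snps):
                p = p_inf if j in informative else 0.5
                row.append(int(rng.random() < p) + int(rng.random() < p))
            X.append(row)
            y.append(breed)
    return X, y

def pc1_loadings(X, rng, iters=50):
    """First principal axis of the column-centred genotypes, by power iteration."""
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    Z = [[row[j] - means[j] for j in range(m)] for row in X]
    v = [rng.gauss(0, 1) for _ in range(m)]
    for _ in range(iters):
        scores = [sum(z[j] * v[j] for j in range(m)) for z in Z]            # Z v
        v = [sum(Z[i][j] * scores[i] for i in range(n)) for j in range(m)]  # Z^T (Z v)
        norm = sum(a * a for a in v) ** 0.5
        v = [a / norm for a in v]
    return v

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, feats, rng, depth=0, max_depth=3, mtry=2):
    """Small classification tree; a random subset of SNPs is tried at each split."""
    majority = Counter(y).most_common(1)[0][0]
    if depth == max_depth or len(set(y)) == 1:
        return majority
    best = None
    for f in rng.sample(feats, min(mtry, len(feats))):
        for t in (0.5, 1.5):  # the only useful thresholds for 0/1/2 genotypes
            left = [i for i, r in enumerate(X) if r[f] < t]
            right = [i for i, r in enumerate(X) if r[f] >= t]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       feats, rng, depth + 1, max_depth, mtry),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       feats, rng, depth + 1, max_depth, mtry))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are labels
        f, t, left, right = node
        node = left if row[f] < t else right
    return node

def fit_forest(X, y, feats, rng, n_trees=25):
    """Each tree is grown on a bootstrap sample of the individuals."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], feats, rng))
    return forest

def predict(forest, row):
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

rng = random.Random(0)
informative = {3, 11, 17, 24, 38}                 # hypothetical informative SNPs
X, y = simulate(50, 40, informative, rng)

# Preselection: keep the SNPs with the largest absolute PC1 loadings.
loadings = pc1_loadings(X, rng)
selected = sorted(range(len(loadings)), key=lambda j: -abs(loadings[j]))[:5]

# Classification: train the forest on half of the animals, test on the rest.
train_idx = list(range(0, len(X), 2))
test_idx = list(range(1, len(X), 2))
forest = fit_forest([X[i] for i in train_idx], [y[i] for i in train_idx], selected, rng)
accuracy = sum(predict(forest, X[i]) == y[i] for i in test_idx) / len(test_idx)
print("selected SNPs:", sorted(selected), "test accuracy:", accuracy)
```

On real array data, the same two steps would usually be run with an established statistical library and more than two breeds; the sketch only shows how a reduced SNP panel feeds the classifier.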