RESEARCH PRODUCT
Random forests, a novel approach for discrimination of fish populations using parasites as biological tags.
John Barrett, Juan Antonio Raga, Aneta Kostadinova, Diana Perdiguero-Alonso, Francisco E. Montero
subject
Population; Population Dynamics; Sample (statistics); Host-Parasite Interactions; Fish Diseases; Gadus; Animals; Parasites; education; Atlantic Ocean; education.field_of_study; Artificial neural network; biology; business.industry; Sampling (statistics); Pattern recognition; biology.organism_classification; Linear discriminant analysis; Random forest; Fishery; Statistical classification; Infectious Diseases; Gadus morhua; Parasitology; Artificial intelligence; business; Algorithms
description
Due to the complexity of host-parasite relationships, discrimination between fish populations using parasites as biological tags is difficult. This study introduces, to our knowledge for the first time, random forests (RF) as a new modelling technique in the application of parasite community data as biological markers for population assignment of fish. This novel approach is applied to a dataset with a complex structure comprising 763 parasite infracommunities in population samples of Atlantic cod, Gadus morhua, from the spawning/feeding areas in five regions in the North East Atlantic (Baltic, Celtic, Irish and North seas and Icelandic waters). The learning behaviour of RF is evaluated in comparison with two other algorithms applied to class assignment problems, linear discriminant function analysis (LDA) and artificial neural networks (ANN). The three algorithms are used to develop predictive models applying three cross-validation procedures in a series of experiments (252 models in total). The comparison of the RF, LDA and ANN algorithms applied to the same datasets demonstrates the competitive potential of RF for developing predictive models: RF exhibited better prediction accuracy and outperformed LDA and ANN in the assignment of fish to their regions of sampling using parasite community data. The comparative analyses and the validation experiment with a 'blind' sample confirmed that RF models performed more effectively with a large and diverse training set and a large number of variables. The discrimination results obtained for a migratory fish species with largely overlapping parasite communities reflect the high potential of RF for developing predictive models using data that are both complex and noisy, and indicate that it is a promising tool for parasite tag studies.
Our results suggest that, using RF, parasite community data can successfully discriminate individual cod from the five North East Atlantic regions studied.
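The general idea of assigning fish to regions with a random forest and scoring the assignments by cross-validation can be illustrated with a minimal sketch. It is not the authors' implementation or software: the data are synthetic counts generated in the code, the region labels "A", "B" and "C", the function names, the tree depth, the number of trees and the simple k-fold split are all assumptions made for illustration, and the forest is a deliberately small pure-Python version (bootstrap samples, a random feature subset at each node, Gini-based splits, majority vote).

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score, n = None, gini(labels), len(rows)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, n_try, rng, depth=0, max_depth=8):
    """Grow one tree; inner nodes are tuples, leaves are class labels."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    # Random feature subset at each node (the "random" in random forest).
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return majority
    f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in lo], [y for _, y in lo], n_try, rng, depth + 1, max_depth),
            grow_tree([r for r, _ in hi], [y for _, y in hi], n_try, rng, depth + 1, max_depth))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_try = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def cross_val_accuracy(rows, labels, k=3, seed=0):
    """Plain k-fold cross-validation (an assumed, simplified protocol)."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in range(k):
        test = idx[fold::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        forest = fit_forest([rows[i] for i in train],
                            [labels[i] for i in train], seed=seed + fold)
        correct += sum(predict_forest(forest, rows[i]) == labels[i] for i in test)
    return correct / len(rows)

def make_synthetic(n_per_region=30, seed=1):
    """Synthetic counts of 4 hypothetical parasite taxa per fish (not real data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for k, region in enumerate(["A", "B", "C"]):
        for _ in range(n_per_region):
            counts = [rng.randint(0, 3) for _ in range(4)]
            counts[k] = rng.randint(8, 15)  # one taxon is abundant per region
            rows.append(counts)
            labels.append(region)
    return rows, labels

if __name__ == "__main__":
    rows, labels = make_synthetic()
    print("cross-validated accuracy:", cross_val_accuracy(rows, labels))
```

The synthetic regions here are far easier to separate than the largely overlapping parasite communities described above, so the accuracy this sketch prints says nothing about the study's results; it only shows the mechanics of fitting and validating such a model.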
year | journal | country | edition | language |
---|---|---|---|---|
2008-10-01 | International Journal for Parasitology | | | |