RESEARCH PRODUCT

Control of dataset bias in combined Affymetrix cohorts of triple negative breast cancer

Volkmar Müller, Sven Becker, Thomas Karn, Uwe Holtrich, Achim Rody, Marcus Schmidt, Lajos Pusztai

subject

Pooling; lcsh:QH426-470; Microarray; Computational biology; Biology; computer.software_genre; Biochemistry; Breast cancer; Data in Brief; Genetics; medicine; ddc:610; Affymetrix microarrays; Triple-negative breast cancer; Gene expression omnibus; medicine.disease; lcsh:Genetics; Sample size determination; Dataset bias; Molecular Medicine; Gene expression; Data mining; computer; Biotechnology

description

Abstract: Heterogeneous subtypes of breast cancer need to be analyzed separately. Pooling of datasets can provide reasonable sample sizes, but dataset bias is an important concern. We assembled a combined dataset of 579 Affymetrix microarrays from triple negative breast cancer (TNBC), available in Gene Expression Omnibus (GEO) as series GSE31519. We developed a method for selecting comparable datasets and for controlling the amount of dataset bias in individual probesets.

DOI: 10.1016/j.gdata.2014.09.014
http://dx.doi.org/10.1016/j.gdata.2014.09.014