Search results for "Inter-rater reliability"

showing 10 items of 33 documents

Assessment of the Tricuspid Valve Morphology by Transthoracic Real-Time-3D-Echocardiography

2005

Aim: To demonstrate the feasibility of transthoracic three-dimensional real-time echocardiography (3D-TTE) supplemental to routine assessments of the tricuspid valve and to analyze interrater agreement. Methods: Twenty healthy subjects and 74 patients with right ventricular failure were examined with conventional 2D and additionally 3D-TTE (SONOS 7500, Philips, Netherlands). The 3D exams were performed and recorded by one of two raters. The recordings were evaluated offline and independently by both raters for visualization of morphological and functional features of the tricuspid valve according to a subjective 3-point scale. Statistical analyses were performed for interrater agreement and…

Adult; medicine.medical_specialty; Time Factors; Functional features; Echocardiography Three-Dimensional; Real time 3d echocardiography; Internal medicine; medicine; Humans; Radiology Nuclear Medicine and imaging; Prospective cohort study; Heart Failure; Observer Variation; Tricuspid valve; business.industry; Healthy subjects; Tricuspid Valve Insufficiency; Inter-rater reliability; medicine.anatomical_structure; Imaging quality; cardiovascular system; Cardiology; Feasibility Studies; Right ventricular failure; Female; Tricuspid Valve; Radiology; Cardiology and Cardiovascular Medicine; business; Echocardiography
researchProduct

The reliability, distribution, and responsiveness of the Postural Control and Balance for Stroke Test

2005

Pyoria O, Talvitie U, Villberg J. The reliability, distribution, and responsiveness of the Postural Control and Balance for Stroke Test. Objectives: To determine the inter- and intrarater reliability of the Postural Control and Balance for Stroke (PCBS) test and to assess its distribution and responsiveness to changes during 1-year follow-up. Design: Intrarater reliability of the PCBS test was assessed by comparing the repeat ratings of videotaped test performances by each of the 5 raters. Interrater reliability was assessed by comparing the ratings of the videotaped test performances between the raters. Setting: Hospital neurologic ward and outpatient department of physiotherapy as w…

Aged 80 and over; Male; Score test; medicine.medical_specialty; Wilcoxon signed-rank test; business.industry; Intraclass correlation; Posture; Rehabilitation; Stroke Rehabilitation; Reproducibility of Results; Physical Therapy Sports Therapy and Rehabilitation; Intra-rater reliability; Middle Aged; Inter-rater reliability; Cronbach's alpha; Task Performance and Analysis; Physical therapy; Humans; Outpatient clinic; Medicine; Female; business; Aged; Balance (ability); Archives of Physical Medicine and Rehabilitation

The Elephant in the Machine: Proposing a New Metric of Data Reliability and its Application to a Medical Case to Assess Classification Reliability

2020

In this paper, we present and discuss a novel reliability metric to quantify the extent to which a ground truth, generated in multi-rater settings, is a reliable basis for the training and validation of machine learning predictive models. To define this metric, three dimensions are taken into account: agreement (that is, how much a group of raters mutually agree on a single case)

Computer science; knee; Machine learning; computer.software_genre; lcsh:Technology; Task (project management); lcsh:Chemistry; 03 medical and health sciences; Magnetic resonance imaging; 0302 clinical medicine; 0504 sociology; General Materials Science; 030212 general & internal medicine; lcsh:QH301-705.5; Instrumentation; Competence (human resources); MRNet; Reliability (statistics); Fluid Flow and Transfer Processes; Ground truth; reliability; Basis (linear algebra); Point (typography); lcsh:T; business.industry; Computer Science::Information Retrieval; Process Chemistry and Technology; 05 social sciences; General Engineering; 050401 social sciences methods; lcsh:QC1-999; Computer Science Applications; Inter-rater reliability; machine learning; lcsh:Biology (General); lcsh:QD1-999; lcsh:TA1-2040; inter-rater agreement; Artificial intelligence; Metric (unit); lcsh:Engineering (General). Civil engineering (General); business; ground truth; computer; lcsh:Physics; Applied Sciences

Concordance Analysis

2011

Background: In this article, we describe qualitative and quantitative methods for assessing the degree of agreement (concordance) between two measuring or rating techniques. An assessment of concordance is particularly important when a new measuring technique is introduced.

Concordance analysis; Inter-rater reliability; business.industry; Concordance; Medicine; General Medicine; Artificial intelligence; business; computer.software_genre; Observer variation; computer; Reference standards; Natural language processing; Deutsches Ärzteblatt international

Statement validity assessment: Inter-rater reliability of criteria-based content analysis in the mock-crime paradigm

2005

Methods. Three raters were trained in CBCA. Subsequently, they analysed transcripts of 102 statements referring to a simulated theft of money. Some of the statements were based on experience and some were confabulated. The raters used 4-point scales to judge the degree to which each of 18 of the 19 CBCA criteria was fulfilled in each statement. Results. The analysis of rater judgment distributions revealed that, with judgments of individual raters varying only slightly across transcripts, the weighted kappa coefficient, the product-moment correlation, and the intra-class correlation were inadequate indices of reliability. The Finn-coefficient and percentage agreement, which were cal…

Correlation; Inter-rater reliability; Validity assessment; Cohen's kappa; Content analysis; Statement (logic); Statistics; Poison control; Psychology; Applied Psychology; Reliability (statistics); Pathology and Forensic Medicine; Reliability engineering; Legal and Criminological Psychology

CHEESE HARDNESS ASSESSMENT BY EXPERTS AND UNTRAINED JUDGES

2001

Although expert assessment of food characteristics is recognized as a key step in product development, the use of consumer-based measurements is sometimes recommended as an equivalent to the experts. Cognitive psychology supports a role for perceptual learning in some instances, although it may not be relevant in others. To address this point, a performance analysis of experts and untrained panelists in cheese texture evaluation was carried out. Neither the untrained panelists nor the experts were familiar with either the scales or the kind of cheese. The same Cheddar cheese was given to 44 untrained subjects in three trials to assess hardness. The results showed that the…

Highly skilled; Inter-rater reliability; Random error; Significant difference; Variance (accounting); Psychology; Social psychology; Sensory Systems; Food Science; Test (assessment); Journal of Sensory Studies

Ensuring content validity of psychological and educational tests – the role of experts

2020

Many test developers try to ensure the content validity of their tests by having external experts review the items in terms of relevance, difficulty, clarity, and so on. Although this approach is widely accepted, a closer look reveals there are several pitfalls that need to be avoided if experts’ advice is to be truly helpful. First, I offer a classification of tasks experts are given by test developers as reported on in the literature dealing with procedures of drawing on experts’ advice. Second, I review a sample of reports on test development (N = 72) to identify the common current procedures for selecting and consulting experts. Results indicate that often the choice of experts seems to…

Inter-rater reliability; law; Meta-analysis; Applied psychology; CLARITY; Content validity; Relevance (law); Psychological testing; Sample (statistics); Psychology; Education; law.invention; Test (assessment); Frontline Learning Research

Assessing learners’ writing skills in a SLA study: Validating the rating process across tasks, scales and languages

2014

There is still relatively little research on how well the CEFR and similar holistic scales work when they are used to rate L2 texts. Using both multifaceted Rasch analyses and qualitative data from rater comments and interviews, the ratings obtained by using a CEFR-based writing scale and the Finnish National Core Curriculum scale for L2 writing were examined to validate the rating process used in the study of the linguistic basis of the CEFR in L2 Finnish and English. More specifically, we explored the quality of the ratings and the rating scales across different tasks and across the two languages. As the task is an integral part of the data-gathering procedure, the relationship of task p…

Linguistics and Language; Rasch model; rating process; Process (engineering); ta6121; CEFR scales; National curriculum; Language and Linguistics; Task (project management); Inter-rater reliability; L2 writing; Rating scale; validointi; Item response theory; Finno-Ugric languages; L2 learning; tehtävät; Psychology; Social psychology; Social Sciences (miscellaneous); Cognitive psychology; Language Testing

Intra- and Inter-Rater Reliability of Strength Measurements Using a Pull Hand-Held Dynamometer Fixed to the Examiner's Body and Comparison with Push …

2021

Hand-held dynamometers (HHDs) are the most widely used method for measuring strength in clinical settings. There are two ways to perform the assessment: pull and push. The purpose of the present study was to evaluate the intra- and inter-rater reliability of a new measurement modality for pull HHD and to compare the inter-rater reliability and agreement of the measurements. Forty healthy subjects were evaluated by two assessors with different body composition and manual strength. Fifteen isometric tests were performed in two sessions with a one-week interval between them. Reliability was examined using the intra-class correlation (ICC) and the standard error of measurement (SEM). Agreement between …

Medicine (General); Clinical Biochemistry; education; pull; Isometric exercise; upper limb; Sitting; Lower limb; Article; 03 medical and health sciences; 0302 clinical medicine; R5-920; Reliability (statistics); Mathematics; Orthodontics; 030222 orthopedics; reliability; Dynamometer; push; Hand held; Healthy subjects; 030229 sport sciences; Inter-rater reliability; hand-held dynamometer; lower limb; strength; Diagnostics (Basel, Switzerland)

Effects of Interrater Reliability of Psychopathologic Assessment on Power and Sample Size Calculations in Clinical Trials

2002

Although rater training is increasingly used to improve the quality of the investigated outcome parameters, the reliability of assessments is not perfect. Thus, empirical reliability estimates should be used instead of theoretically assumed perfect reliability. Implications of the reliability of psychiatric assessments for sample size and power calculations in clinical trials are presented. The theoretical basis of sample size and power calculations using empirical reliability scores is delineated. Examples from contemporary research on schizophrenia and depression are used to illustrate several implications for study design and interpretation of results. The tremendous impact of the lack o…

Observer Variation; Estimation; Clinical Trials as Topic; Psychopathology; business.industry; media_common.quotation_subject; Clinical trial; Psychiatry and Mental health; Power analysis; Inter-rater reliability; Sample size determination; Sample Size; Statistics; Humans; Pharmacology (medical); Quality (business); Psychology; business; Quality assurance; Reliability (statistics); media_common; Journal of Clinical Psychopharmacology