Proper and appropriate health care for a patient requires error-free assessment of clinical measurements. For example, a diagnostic test that classifies an individual as having or not having a disease must produce accurate and reliable results, so that an individual who needs treatment receives the correct therapy. Agreement and reliability studies aim to evaluate the accuracy and consistency of diagnostic tests or measurement tools. A model developed by Shoukri and Donner allows for the concurrent assessment of inter-rater (between-rater) agreement and intra-rater (within-rater) reliability by incorporating two measurements per rater per subject.
The main purpose of this research was to develop maximum likelihood (ML) methods for the Shoukri-Donner model and to compare them with the method of moments (MM) approach using Monte Carlo simulation studies. Little difference between ML and MM was observed in point estimation. In general, the MM Wald test and MM confidence interval (CI) performed better than the other methods. In particular, the goodness-of-fit (GOF) test and GOF CI (for both ML and MM) showed high empirical type I error rates and low coverage levels, respectively, for the inter-rater agreement parameter in some parameter combinations in the three-parameter case and in all parameter combinations considered in the four-parameter case. The reasons for the poor performance of the GOF approach need further investigation before it can be recommended as a better alternative to the MM approach. Nor does the ML approach appear to be necessarily better than the MM approach. Lastly, extending this research to a more general five-parameter model requires resolving several issues before the model can be evaluated in point estimation, hypothesis testing, and CI construction.