Inter-Rater Reliability
August 16, 2021 | By Auburn
Inter-rater reliability (IRR) refers to the degree of agreement among independent observers who rate, code, or assess the same phenomenon. Strong IRR is an important part of sound methodology in scientific research. Without it, trained practitioners cannot consistently agree, under blind conditions, on their typing or designation. When even trained practitioners cannot reach agreement, this casts doubt on the reliability and objectivity of the metric in question.
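
To make "degree of agreement" concrete, here is a minimal sketch of two common ways to quantify it for a pair of raters: raw percent agreement, and Cohen's kappa, which corrects for the agreement expected by chance. The function names and the labels "A" and "B" are hypothetical placeholders chosen for illustration; they are not taken from any particular typology or from CT's own tooling.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of samples on which two raters assigned the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of samples")
    matches = sum(x == y for x, y in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)

    # Chance agreement: probability that both raters pick the same label
    # if each drew labels independently at their own observed frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)

    if p_expected == 1.0:
        # Both raters used one identical label throughout; kappa is
        # formally undefined here, and this sketch reports 1.0 by convention.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels for four samples, rated independently by two raters.
a = ["A", "A", "B", "B"]
b = ["A", "A", "B", "A"]
print(percent_agreement(a, b))  # 0.75
print(cohens_kappa(a, b))       # 0.5
```

Kappa comes out lower than raw agreement here because, with only two labels in play, the raters would agree half the time by chance alone.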

The IRR problem in typology

Jungian typology has long been plagued by low inter-rater reliability. An individual typed by one Jungian analyst may be typed another way by a different analyst. Although two analysts may loosely agree on the general terms they use, when the rubber meets the road they will often contradict one another on a specific person's type, whether a celebrity or a civilian.

This is one of many problems that caused Jungian typology to fall out of academic favor. Its inability to produce consistent results has led to skepticism about the reality of the types altogether, and not without good reason. If you receive a typing from an analyst, but that analyst cannot teach the typing method to others so that they reproduce it, then the typing you receive amounts only to a private opinion, rather than a fact about who you are that can be independently confirmed.

Cognitive Typology and IRR

Cognitive Typology (CT) is capable of reproducing its typing methodology among practitioners with high levels of precision. This is because typings in CT are performed objectively, using a series of visual signals and audio cues that anyone can directly verify or falsify.

The signals are also described precisely enough that the strength of each signal can be measured. This leads to a level of inter-rater agreement unprecedented among Jungian models, with trained practitioners reaching 90%+ consensus in blind experiments. A typing by a trained vultologist is not solely a private opinion, but a factual description of your body mannerisms that can be independently confirmed by other vultologists.
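
As a rough illustration of how a consensus percentage across several raters could be computed, the sketch below averages, over all samples, the share of raters who chose the most common label for that sample. This is only one possible definition and is an assumption made for illustration; the article does not specify how CT calculates its consensus figure. The function name, sample IDs, and labels are all hypothetical.

```python
from collections import Counter


def consensus_rate(ratings):
    """Mean share of raters agreeing with the most common label per sample.

    `ratings` maps a sample ID to the list of labels that the raters
    independently assigned to that sample.
    """
    if not ratings:
        raise ValueError("No samples provided")

    shares = []
    for sample_id, labels in ratings.items():
        if not labels:
            raise ValueError(f"Sample {sample_id!r} has no ratings")
        # Size of the largest group of raters who gave the same label.
        modal_count = Counter(labels).most_common(1)[0][1]
        shares.append(modal_count / len(labels))
    return sum(shares) / len(shares)


# Hypothetical blind experiment: two samples, four raters each.
results = {
    "sample_1": ["A", "A", "A", "A"],  # unanimous
    "sample_2": ["A", "B", "A", "A"],  # three of four agree
}
print(consensus_rate(results))  # 0.875
```

A figure like this does not correct for chance agreement, so it is best read alongside a chance-corrected statistic when the number of possible labels is small.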

Public Experiments

To allow for full transparency, CT periodically runs blind experiments within our community forum, allowing members (newcomers and veterans alike) to test their reading abilities against CT's metrics. Samples are sent privately to each participant, who then submits results back to the host. At the end, the host compares the results of all participants. This process has been refined over the years, and a record of these experiments can be seen below:
