How do you know whether your findings are valid?

First and foremost, I am going to give you a brief description of the term validity, because if you are anything like me, the knowledge acquired from first-year statistics packed its bags and left over the summer.

Validity is a criterion for evaluating the quality of any measurement procedure: the validity of a measurement procedure depends on how effectively the measurement process measures the variable it claims to measure. Validity is particularly important when using an operational definition to measure a hypothetical construct. Operational definitions help us convert an abstract variable into a concrete, measurable entity. For example, we are unable to measure intelligence directly; we cannot simply measure it with a ruler. Instead, our best attempt at measuring intelligence is to measure intelligent behaviour, which we can do by giving participants IQ tests. IQ tests measure intelligent behaviour by measuring responses to questions. How valid is this method of measuring intelligence? Well, there are always concerns about the quality of operational definitions and the measurements they produce. Hypothetical constructs are not physical entities and cannot be measured directly; therefore, the validity of measurements produced by operational definitions will always be under scrutiny.

Fortunately, we can assess the validity of a measurement procedure. Face, concurrent, predictive, construct, convergent, and divergent validity are the six most commonly used definitions of validity, and each gives us a way of evaluating a measurement (Gravetter & Forzano, 2009).

Your findings may present themselves as valid when the measurement procedure superficially appears to measure what it claims to measure; this is face validity. Face validity is the least scientific definition of validity, but it is simple and time-efficient. If there is something drastically wrong with your measurement procedure, you will most probably be able to detect it on face value alone. However, it would be very unscientific to assume that your measurement of a construct was valid just because it appeared to be. You may be inclined to think that your findings are valid if the scores you obtain using your measurement procedure are concurrent with the scores obtained using an already established procedure; this is concurrent validity. However, obtaining scores concurrent with an established procedure does not prove that your procedure measured the construct you intended to measure. The two measurement procedures may both have measured the same variable, but it may not have been the variable either procedure wanted to measure. You may also deduce that your findings are valid when the scores you obtain from your measure accurately predict behaviour according to a theory; this is a demonstration of predictive validity. If the scores you obtain from your measurement procedure behave in exactly the same way as the variable itself, your findings have demonstrated construct validity, according to Gravetter and Forzano (2009). Convergent validity is demonstrated by a strong relationship between the scores obtained from two different methods of measuring the same construct. Finally, divergent validity involves demonstrating that you are measuring one specific construct and not combining two different constructs in the same measurement process.
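
If you like to see these ideas as numbers, here is a minimal sketch in Python of the kind of check involved in concurrent and convergent validity: correlating the scores from a new measure with the scores from an established measure of the same construct. The participant scores and the 0.7 cut-off below are made up purely for illustration; how strong a relationship you demand is a judgement call, and a high correlation still does not prove you measured the construct you intended to.

```python
# Illustrative sketch: checking whether a new measure "converges" with an
# established measure of the same construct by correlating their scores.
# The scores and the 0.7 threshold are hypothetical, for illustration only.

from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical scores for the same ten participants on two measures
new_measure = [12, 15, 9, 20, 17, 11, 14, 18, 10, 16]
established_measure = [30, 34, 25, 41, 38, 27, 33, 40, 26, 36]

r = pearson_r(new_measure, established_measure)
print(f"correlation between the two measures: r = {r:.2f}")

# A strong positive correlation is taken as evidence of convergent
# (or concurrent) validity; a weak one would be cause for concern.
if r >= 0.7:
    print("strong relationship - consistent with convergent validity")
else:
    print("weak relationship - the new measure may not tap the same construct")
```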

I have now presented you with six ways of assessing the validity of your measurement: face validity, concurrent validity, predictive validity, construct validity, convergent validity, and divergent validity. Some forms of validity are slightly dubious; for example, as I mentioned earlier, you cannot infer that your results are valid just because they superficially appear valid. In fact, you can never prove that your findings are valid when measuring a hypothetical construct. However, it is fair to assume that your findings are valid if you have demonstrated the six aforementioned forms of validity.

8 thoughts on “How do you know whether your findings are valid?”

  1. You have made a lot of good points again in your blog this week 🙂 but I thought of a study that helps to show how easy it is to diminish validity, and also to show that whilst a measure may be valid, the study may not be.

    Mary Ainsworth (1970s) conducted an experiment in which she observed children in what she called the ‘Strange Situation’. She observed the interaction and relationship between a child and the caregiver, and then between the child and a stranger…
    Some people have argued that it was not a valid study, as it only looks at the child’s relationship with the mother and does not take into account that the child may have a different attachment, for instance with a grandparent (Lamb, 1977). The study also lacks ecological validity, as the child was in an artificial environment and both the mother and the stranger were following a particular script.
    This experiment shows how easy it is to decrease a study’s validity, even when it seems realistic and uses a reliable measure. More children today are in nurseries than would have been at the time this study was conducted; times have changed, and we therefore have to assume that children may have stronger attachments to people who are not their mother, which decreases the temporal validity of the study.

  2. This was a comprehensive and informative blog post that was very close to perfect. I would add the small but important point that before one can demonstrate any of the forms of validity, one must first show that the measure is reliable. As well as this, I would highlight the fact that construct validity is perhaps the most difficult form of validity to demonstrate. It represents an ideal study in which one could show that all previous findings related to the particular construct have been confirmed within the study. This is well-nigh impossible to complete. Take the study of aggression, which has been measured in relation to diet, family, upbringing, television, games, and so on. One could never manage to demonstrate complete construct validity of aggression within a new study. But even though construct validity is impossible to show completely, demonstrating a certain degree of it can be convincing.

  3. I agree with ecstatic; this is a very well structured and informative scientific blog, with an appropriate amount of detail and evidence to back up your points about validity. However, one thing that would make it perfect is a bit more information on how measuring validity is complicated by error, and how certain types of error, such as systematic error and random error, can affect the findings. It would also be worth including more information on how a “true score” can never be observed directly, and how the observed score is made up of the true score plus any error.

  5. A very informative and detailed blog this week. I feel that you made some interesting points on validity and how it affects studies. I found this read particularly attention-grabbing as I wrote about reliability in my blog for week 3, and obviously reliability and validity work hand in hand, so it was nice to see the other side of the story! I like how you gave examples for the six types of validity. The only thing I would say is to maybe give some more evidence for each example. For example, Messick (1995) suggested that there are two threats to construct validity, which narrow the reliability of the method. http://knol.google.com/k/validity-in-empirical-social-science-research# Other examples of the like would give even more backing to an already very strong blog post.

    • I agree with your viewpoint that I did not provide enough evidence for each example. So, here it goes. Kosten et al. (1983) evaluated the concurrent validity of their Addiction Severity Index (ASI). They found that the ASI sub-scales had good concurrent validity with self-report measures in the areas of psychological, social adjustment, legal, and employment problems. Despite the ASI showing only limited concurrent validity for drug abuse problems, they concluded that the ASI could be a potentially important evaluation instrument. Concurrent validity is demonstrated when the scores obtained from one measure are the same as the scores obtained from another, better-established procedure for measuring the same variable. Self-reports may not necessarily be a better-established measure, but the scores obtained were concurrent with their new measurement scale, the ASI. This provided Kosten and colleagues with evidence that their measure was measuring what they intended it to measure.

      For a closer inspection of the study – http://psycnet.apa.org/index.cfm?fa=search.displayRecord&uid=1984-05635-001
      Kosten, T. R., Rounsaville, B. J., & Kleber, H. D. (1983). Concurrent validity of the Addiction Severity Index.
