Functional Capacity Examinations: How Reliable Are They?
By: Michael McHenry, Gleason & McHenry, PLLC

Functional Capacity Examinations purport to accurately determine an injured individual’s capacity to return to work. FCE’s are commonplace in many aspects of medical treatment for litigated claims. In the workers’ compensation arena, many physicians order FCE’s on every claim and base any restrictions or limitations they assign exclusively on the results of the FCE. This, of course, raises the question of whether FCE’s and their resulting outcomes are reliable and valid.

To date, there has been very little peer-reviewed evidence as to the validity and reliability of FCE’s. There are literally dozens of different types of FCE’s in the marketplace, and virtually all of them purport to be valid. Further, almost all of them claim in their marketing material to have been thoroughly tested and found to be the most reliable on the market. Not all of the different tests can be the most reliable. A quick review of the information available on particular types of FCE’s reveals that most of the testing has been performed by the creators and sellers of the different models. This makes the results of these tests questionable at best.

One occupational therapist has taken it upon himself to study the validity and reliability of FCE’s.[1] Ev Innes, of the University of Sydney, Australia, has conducted two reviews of all independent published studies of the many different FCE’s in the market. Mr. Innes’s first review was published in 1999 with the help of a fellow therapist, Leon Straker. The article is entitled “Validity of Work-Related Assessments.” After studying many of the systems and the independently published information on them, the authors reached the following conclusions:

“As with reliability, most work-related assessments have limited evidence of validity. A number had insufficient evidence on which to base an assessment of the level of validity. Of those that had adequate evidence, validity ranged from poor to good.”

The conclusion continued: 

“There was, however, no instrument that demonstrated moderate to good validity in all areas. Very few work-related assessments were able to demonstrate adequate validity in more than one area, or with more than one study, even when contributory evidence was included. This highlights the need for further research to be conducted in this area. Test developers, clinicians and academics are strongly encouraged to continue investigating the validity of work-related assessments.”

It was further found that:

“The acceptance of work-related assessments on the basis of their longevity in the marketplace and clinic should not be assumed to equate with adequate validity.”[2]

Mr. Innes supplemented the 1999 publication with a 2006 publication entitled “Reliability and Validity of Functional Capacity Evaluations: An Update,” International Journal of Disability Management Research, 135-148. In that update, Mr. Innes reviewed all comprehensive reviews of FCE’s published from January 1998 until March 2006. Mr. Innes attempted to analyze all of the most common FCE’s. Unfortunately, some of the more popular FCE’s in use had been the subject of so few peer-reviewed studies that Mr. Innes was unable to evaluate them.

The three (3) most popular FCE’s in use in north Mississippi had the following results:

  1. Blankenship FCE

Mr. Innes determined that there were insufficient peer-reviewed publications to adequately review the Blankenship FCE, and therefore it was not reviewed.

  2. Matheson System FCE

Again, Mr. Innes determined that there were insufficient peer-reviewed publications to adequately review the Matheson System FCE, and therefore it was not reviewed.

  3. Physical Work Performance Evaluation (PWPE)

Mr. Innes found that the test-retest reliability of nine tasks in the PWPE ranged from poor to substantial. The mobility tasks (stair climbing, repetitive squatting, and walking) were the least reliable aspects of the test, with only poor to moderate reliability. The results further showed that interrater reliability varied and was lower when determining subject participation, particularly in the dynamic strength and mobility sections of the FCE.[3]


Some of Mr. Innes’s more interesting conclusions are as follows:

  1. There was poor correlation between the results of the many different types of FCE’s.
  2. Evaluators must use extreme caution when determining maximal versus sub-maximal efforts.
  3. Predictive validity of the FCE’s is rarely investigated.
  4. There is a negligible correlation between FCE’s and psychological tests, self-report measures, and other aspects such as strength and aerobic capacity.
  5. Finally, the longevity of an FCE in the marketplace does not equate to reliability or validity.

If your practice deals regularly with FCE’s, you are strongly encouraged to read these articles and the articles referenced therein. Many of the FCE’s on the market do not meet the criteria necessary to withstand a well-drafted Daubert motion. Further, most physical therapists who perform FCE’s have little knowledge of the problems associated with the model they use.

[1] This article focuses on the work of Ev Innes and Leon Straker. That said, there are hundreds of articles on FCE’s in general.

[2] (Innes and Straker, 1998, 3637)

[3] (Innes 2006 135-148)