Marc Chun, Ph.D. is a research scientist at the RAND Corporation's Council for Aid to Education. His primary areas of research are the sociology of education, higher education assessment and quality, and the social organization of knowledge production. Dr. Chun was the project manager for the Collegiate Learning Assessment feasibility study and scale-up. Other current research projects include a study of the scale-up of constructivist elementary school math curricula, and work to support higher education policy and reform in Qatar. He completed his graduate training in both sociology and education at Stanford University, and a postdoctoral fellowship at Columbia University.
We live in a “choose two” culture. Although we seek to accomplish things faster, better and cheaper, we can’t have all three at once. Ask a chef to come up with a meal quickly and cheaply, and it probably won't be that good; ask an architect to design a building of higher quality and in less time, and that won't come cheap; and ask NASA to create a new space probe that is better but to do so with limited funds, and it will take more time.
In a choose two culture, you can trade off these triple-constraints against one another. You will have to decrease one of them to keep the other two steady (or to improve them): for example, you can stay under budget and complete the work before the deadline by cutting corners, but this comes at the expense of quality. You can produce something of higher quality less expensively, but that will take longer. In the same vein, if one dimension is unbounded, you have tremendous flexibility. If you had an unlimited budget, you could do whatever you want as fast as possible; it will just cost a bundle.
One way to illustrate this dynamic is the "iron triangle" below, with each dimension at one of the triangle's vertices. Think of the default mode as the gray circle in the center -- the project will be completed at moderate speed, of moderate quality and at moderate cost.
Any movement toward a side of the triangle means sacrificing at least one of the dimensions. As noted in the example below, the red circle indicates that by "choosing two" and completing the task faster and cheaper, one must accept that the result will not necessarily be better.
In many ways, the same can be said of higher education assessment, and specifically for this discussion, assessment of undergraduate student learning. Before proceeding, it is important to define the terms as they are used here:
What's faster? Requiring less time to collect data and to complete analyses.
What's cheaper? Necessitating fewer resources (money, staff, technology, and other materials) to collect the data and complete analyses.
What's better? Having overall higher quality assessment (admittedly the least straightforward of these terms, but here it is defined as the accuracy and authenticity of the indicators, and the scope of the assessment).
This three-way tug-of-war is a daily challenge for those conducting higher education assessment. Ask a campus administrator to assess student learning, and although the desire may be to have all three, it's really only fair to expect her to choose two; often she must settle for just one.
As a heuristic, the traditional approaches this administrator has at her disposal can be organized into four basic families or groupings: (1) actuarial data; (2) ratings of institutional quality; (3) student surveys; and (4) direct measures of student learning. Each will be considered along the dimensions of the degree to which they are faster, better and/or cheaper.
Method 1: Actuarial Data
What are often seen as the most "objective" measures of higher education are analyses based on "actuarial" data. These data include graduation rates, levels of endowment, student/faculty ratios, highest degrees earned by faculty members, selectivity ratios, admissions test scores of entering students, and levels of external research funding. Although not intrinsic to the data themselves, the analyses typically rely upon two central assumptions: that a "higher quality" institution has more resources (funding, faculty with a higher percentage holding Ph.D.s, and students with high entrance examination scores), and that students learn more at such "higher quality" institutions. As one might suspect, these assumptions are not universally accepted.
Researchers argue that the primary advantages of using actuarial data are that these data are relatively straightforward to collect, and the resulting statistics can be easily compared across institutions and over time. Indeed, actuarial data are arguably faster and cheaper to collect; given the highly systematized and standardized procedures, such projects capitalize on tremendous efficiencies of scale. However, although actuarial data have prima facie validity in objectively assessing higher education quality, it is not clear that such approaches are better, in that these tools often cannot directly measure student learning.
Method 2: Ratings of Institutional Quality
A second approach to higher education assessment is based on analyses of ratings and rankings of institutions. This has typically taken the form of surveying college faculty, administrators, or both, and asking these "experts" to rate the quality of different institutions and their programs on a series of dimensions, again with the assumption that students learn more at such "higher quality" institutions. Further, the implicit logic here is that informed "experts" can best assess institutional quality.
Again, although such rankings are relatively faster and cheaper, methodological concerns have raised doubts that this information is indeed better. For the rating systems published in popular periodicals, the weighting schemes are often editorial in nature and hard to defend on theoretical grounds (e.g., they may weight student-faculty ratio as 20% of the total score and research productivity as 10%, without providing any empirical justification that student learning is affected by each variable in those proportions), and the resulting rankings can be highly sensitive to even minor weighting changes. Additionally, some variables may lack face validity: alumni giving is claimed to serve as a proxy for student satisfaction, when it can arguably instead be a function of the effectiveness of the development office or the relative wealth of the students. Further, reputations change slowly, advantaging institutions that are slowly declining and disadvantaging those that are rapidly improving.
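To make the weighting concern concrete, here is a minimal hypothetical sketch. The institution names, scores, and the two indicators are all invented simplifications (a real ranking formula uses many more variables), but the sketch shows how a modest shift in editorial weights can reverse a ranking even when the underlying data do not change at all:

```python
# Hypothetical illustration: how small weighting changes can reorder rankings.
# Institutions, indicators, and scores (0-100) are all invented.
schools = {
    "College A": {"student_faculty": 90, "research": 40},
    "College B": {"student_faculty": 60, "research": 80},
}

def rank(schools, weights):
    """Return institution names ordered by weighted composite score, best first."""
    composite = {
        name: sum(weights[indicator] * score for indicator, score in scores.items())
        for name, scores in schools.items()
    }
    return sorted(composite, key=composite.get, reverse=True)

# Weighting student-faculty ratio at 20% and research at 10% favors College A...
print(rank(schools, {"student_faculty": 0.20, "research": 0.10}))

# ...while simply swapping those two weights puts College B on top.
print(rank(schools, {"student_faculty": 0.10, "research": 0.20}))
```

Since nothing about the institutions changes between the two calls, the reordering reflects only the editors' choice of weights, which is precisely the theoretical-justification problem with published rating systems.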
Method 3: Student Surveys
A third approach used to assess institutions is based on self-reported student information. In contrast to the proxy data used in the actuarial approach and ranking data based on surveying faculty and administrators, these data are collected by asking students directly about their collegiate experiences, satisfaction, academic abilities, and educational and employment plans. Typically, individual institutions collect such data to gather feedback about their institution while national researchers collect data from a number of institutions in order to generate research on the effects of higher education in general.
Although such data likely aren't cheaper or faster than the first two methods discussed (it takes more resources and time to collect surveys from individual students), if one seeks to understand student learning, these data are somewhat better; rather than relying on rough proxies, as actuarial data and rating systems do, surveys ask students directly about their learning. Further, because many of the outcomes of interest cannot be empirically measured (e.g., attitudes and values), student self-report surveys are generally considered a better method for capturing these phenomena. However, a key issue in student surveys, as in all surveys, is the reliability of the self-reported data (particularly given the "desirability bias").
Additionally, although student surveys may yield better data, this does not necessarily lead to better analyses. It may be difficult to determine the actual impact of any given process variable. Moreover, the traditional positivistic approach often employed in such analyses assumes that individual aspects of the college experience can be studied atomistically, which can be seen as denying the holistic nature of student learning.
Method 4: Direct Assessments of Student Learning
A fourth approach to assess institutional quality is to measure student learning directly. Direct assessments of student learning are perhaps the least systematically used of the four methods discussed here, but have the greatest face validity (to assess what students have learned, assess, well, what students have learned). Direct assessment may involve analyzing course grades; administering standardized tests, performance tasks, or open-ended questions to assess general academic skills or subject matter knowledge; and obtaining data from evaluations of student projects or portfolios of student work. Researchers tend to agree that this approach is a valid measure of students’ abilities, but the use of one performance indicator may not be reliable. For example, a student may write an excellent term paper on one topic, but not on another, due to varying levels of motivation or interest in the topic. While such approaches arguably provide better data, they are generally neither faster nor cheaper.
The Four Approaches Compared
When it comes to understanding what students have actually learned in college, the literature suggests that we face a conundrum. There is general agreement that student learning is important and valued, but little (if any) agreement on how to assess it. Actuarial data are commonly used because of the ease of data collection and the patina of scientific objectivity, but this approach equates quality with discrete, available, and (perhaps most significantly) easily measurable indicators. Institutional rankings rely on a formula combining actuarial data and ratings by informed experts, but these rankings are limited (and questionable) because they provide only an indirect measure of quality and because they tend to conflate quality and reputation. Student surveys draw on students' perceptions of their own learning, but research has shown that such measures may be problematic because they depend upon student self-evaluation; still, this work has been an important step in connecting student learning with educational quality. Finally, direct measures of student learning arguably have the greatest face validity with regard to assessing undergraduate education, but the literature indicates that numerous issues complicate their implementation.
Another Methodological Note
An important note is that the temporal design of any study will also shape the faster/better/cheaper analysis. A cross-sectional snapshot using any of these data sources has the advantage of being faster, but may limit the ability to make claims about changes over time in student learning in any value-added way. To study college impact using a pre- and post-test model will arguably produce better analyses (in that having two time points has clear advantages over a solely retrospective survey design), but this will neither be cheaper nor faster. Tracking specific members of a particular cohort of students can be even more expensive, but can provide richer data about student growth.
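The distinction between a cross-sectional snapshot and a pre-/post-test design can be sketched with a small hypothetical example; all scores below are invented. A single administration yields only a level, while two time points support a simple gain-score ("value-added") estimate:

```python
# Hypothetical illustration of snapshot vs. value-added designs.
# Invented assessment scores for the same cohort of five students.
pretest  = [52, 61, 48, 70, 55]   # first-year administration
posttest = [60, 66, 59, 78, 57]   # senior-year administration

def mean(scores):
    """Arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)

# Cross-sectional snapshot: one time point. Faster and cheaper,
# but it says nothing about how much students grew.
snapshot = mean(posttest)

# Gain-score estimate: requires both administrations, so it is
# slower and costlier to collect, but it speaks to change over time.
gain = mean(posttest) - mean(pretest)

print(f"snapshot mean: {snapshot:.1f}")
print(f"mean gain:     {gain:.1f}")
```

Even this toy version makes the trade-off visible: the value-added figure requires twice the data collection of the snapshot, which is exactly why the longitudinal designs discussed above are neither cheaper nor faster.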
So Which to Choose?
To return to our campus administrator: she most likely faces a particular version of this tug-of-war. She can meet two of the demands, but it's impossible to accomplish all three. Typically she won't have unlimited time or unrestricted resources; to prepare for the looming accreditation visit or faculty committee meeting, she may have to sacrifice a better assessment in order to stay on schedule and under budget. This is understandable given the very real exigencies she faces. Further complicating the matter is, as noted before, the lack of universal consensus about what "better" means. Without a common definition, the four assessment approaches may be treated as equivalent in quality; thus, the fastest and cheapest will typically win out.
The point to keep in mind, however, is that it's important for all on campus to recognize these trade-offs. A campus that focuses exclusively on cost and schedule must be prepared for the fact that some compromises in quality will be made, in just the same way that a campus that focuses on having a high quality assessment in short order must be willing to put its money where its mouth is. The key is to identify what is most important to the campus: doing work faster, better or cheaper.
I argue here, however, that in the best interest of our students and of improving academic programs, campuses should place the highest priority on better. Quality should never be negotiable. If one completes analyses cheaper and faster but no meaningful conclusions can be drawn, the entire exercise is rather worthless. Having better data, conducting better analyses, and coming to better conclusions should be the goal. Although better assessment may cost more and take more time, I would add an unspoken fourth dimension: return on investment, which will assuredly be higher with better assessment. Better assessments have the potential to ensure that, in the long run, better programmatic choices will be made, that our understanding of student learning will be deeper and more accurate, and that our ability to conduct more sophisticated and nuanced analyses will continue to advance. Until the day comes when we can choose three, perhaps we should instead start by choosing one.
The author gratefully acknowledges the helpful comments of Elizabeth McEneaney and Mary Rauner on an earlier draft of this essay, and feedback from Richard Hersh.