Adapting User Research Methodology for Behavior Analytic Instruction Design

by Bryan C. Tyner & Daniel M. Fienup
Behavior analysis has a great deal to contribute to a wide variety of industries outside of academia and clinical practice, including the technology industry. Technical skills such as programming, proficiency with relevant software, and familiarity with industry-standard research methodologies would facilitate graduates’ entry into these fields, and may be useful in behavior analytic research as well.

Software user research (UR) is a career in which behavior analysts could excel. User researchers collect and analyze data to inform product development. A primary consideration in UR is usability1: a qualitative assessment of how easily and pleasantly users can operate a product. Relevant user behavioral variables include accuracy upon first using a product, errors committed over time, and the time required to complete tasks, among others. There is clear opportunity for behavior analysts to contribute to this industry; however, despite our aptitude for such analyses, behavior analysts often lack experience with the industry-standard methods, practices, and terminology used in UR and other technology careers that could facilitate entry into those fields.

The Process of Iterative Development

One method common in user research is called iterative development. The Nielsen Norman Group (NN/g), a major consultation firm in UR, defines iterative development2 as:

…steady refinement of the design based on user testing and other evaluation methods. Typically, one would complete a design and note the problems several test users have using it. These problems would then be fixed in a new iteration which should again be tested to ensure that the ‘fixes’ did indeed solve the problems and to find any new usability problems introduced by the changed design.

In a sense, this is simply a more systematic approach to pilot testing. NN/g suggests beginning with a comparison between two or more competing product designs, a method called parallel testing. Parallel testing is essentially a between-groups design run over a relatively short period because of time-to-market constraints. The results of parallel testing inform the decision of which design to publish and prompt immediate product modifications. Product development does not end with parallel testing; rather, subsequent testing and further usability improvements are made based on the superior initial design. Again, this is akin to what behavior analytic researchers might do during pilot testing. Once a design is selected for production, iterative development becomes a linear-process model3 in which user performance and satisfaction are measured. These data are then analyzed to inform future revisions, and testing continues.
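The test-revise-retest cycle described above can be sketched as a simple loop. This is a toy illustration only: the "design" here is just a set of flaw labels, each round of testing reveals a few of them, and revision removes what was found. The function and flaw names are our own, not an industry API.

```python
# Toy sketch of the iterative-development loop: test the current design,
# note the problems a few users surface, fix them, then test again.
def run_test_cycle(design_flaws, reveal_per_round=2, max_rounds=6):
    """Iteratively discover and remove flaws; return the revision log."""
    remaining = set(design_flaws)
    log = []
    for round_num in range(1, max_rounds + 1):
        if not remaining:
            break  # no flaws left to find; stop testing
        # A small user sample surfaces only some of the remaining flaws.
        found = sorted(remaining)[:reveal_per_round]
        remaining -= set(found)  # "revise": fix what this round revealed
        log.append((round_num, found))
    return log

history = run_test_cycle(["axis labels", "phase lines", "font size",
                          "button color", "step order"])
for round_num, fixed in history:
    print(f"Iteration {round_num}: fixed {fixed}")
```

Note that each fix is verified implicitly: if a revised section still produced errors, its flaw label would simply reappear in a later round's log.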

There are other similarities between iterative development and behavior analytic research. For one, iterative development is user-centric: the continuous data collection and incorporation of user feedback place the focus on the performance of individuals. Sample sizes are generally small, as in many behavior analysis studies, partly to limit research cost and time investment, but also because once a design flaw is identified, it is more cost-effective to correct it before continuing testing. One rationale for this is that future testing can reveal whether changes to the software have actually improved performance. Despite the limitations of small samples, collecting more data before revising the product wastes the potential to verify the efficacy of those revisions. NN/g describes these issues in terms of diminishing returns4. Each participant added to a sample is incrementally more likely to commit the same errors as have already been observed, because their performance is affected by the same interface flaws as previous users’. While observing multiple participants commit the same errors increases confidence in the validity of those observations, NN/g argues that sampling more than five participants mostly yields redundant observations, renders research slower and more expensive, and generates less frequent feedback for the engineers; as a result, the product is more likely to reach a stage at which correcting identified design flaws is far more difficult or even impossible.
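The diminishing-returns argument in the cited NN/g article4 rests on a simple model: with n test users, the expected share of usability problems found is 1 − (1 − L)^n, where L is the probability that a single user exposes any given problem (about 31% in Nielsen's data). A few lines of arithmetic show why five users sit near the knee of the curve:

```python
# Expected share of usability problems found with n test users, using
# the model from Nielsen (2000): found(n) = 1 - (1 - L)**n, where
# L = 0.31 is the average per-user detection rate he reports.
L = 0.31  # probability that one user exposes any given problem

def share_found(n, per_user_rate=L):
    """Expected proportion of usability problems found by n users."""
    return 1 - (1 - per_user_rate) ** n

for n in (1, 2, 3, 5, 10):
    print(f"{n:2d} users -> {share_found(n):.1%} of problems found")
```

One user finds about 31% of problems and five users about 84%, so each additional participant past five buys little new information relative to running another, revised iteration instead.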

Adapting User Research Methodology for Behavior Analytic Instruction Design

The focus of our own research is behavior analytic instruction design—applying what we know about learning processes and the acquisition of behavior to the development and improvement of instructional methods and materials.

Iterative development appears particularly amenable to the design and development of instruction materials. Usability research and instruction design share the goals of individualizing the learner’s experience and optimizing performance. NN/g argues that no interface is designed without some initial usability flaw, an assumption that also applies when designing instruction materials. Instructions are typically created by experts in a content area; however, experts are not always the best teachers, and the effectiveness of instructions is limited by the developers’ biases, learning history, and assumptions about what makes teaching effective. For example, experts may not be able to accurately identify the prerequisite knowledge, skills, and abilities needed to perform a task, ideal starting points, or the appropriate level of detail and rate at which instruction is delivered. Furthermore, even good instruction is likely to have room for improvement.

With these considerations in mind, we hypothesized that incorporating UR methodologies into instruction design would facilitate accurate and timely tutorial development as well as provide industry-relevant experience for our graduate researchers. We decided to experiment with iterative development in our next project refining a computer-based task analysis for creating a reversal design in Microsoft® Excel®. This project seemed like a particularly good fit because there are so many opportunities for error while constructing a graph—especially during initial skill acquisition—and low performance on one step often precludes correct responding on subsequent steps. Rapidly iterating new versions that each addressed observed learner errors seemed a potentially effective method of improving the task analysis.

In addition to improving the task analysis, we sought to measure the improvement to the task analysis during iterative testing. We had recently updated a task analysis developed for teaching reversal design construction in Excel 2007 to reflect process changes in the 2013 release of the software. During revisions, we identified instructional factors hypothesized to enhance instructional effectiveness. For example, we hypothesized that describing in the task analysis the performance criteria used to evaluate learners’ graphs would lead to better graphing. Similarly, we hypothesized that describing relevant cues to which learners should respond, such as the location and appearance of buttons and menu items within Excel, would enhance performance. However, these are empirical questions: the effects of these variables on learner performance are unknown. In other words, it was possible that a concise stepwise description of correct responses would produce faster and more accurate performance than longer task analyses including these additional details.

These hypotheses yielded two tutorial designs for a small-N parallel test. The first tutorial described only the steps to perform to complete the graph from start to finish. The second tutorial was identical except it was supplemented with descriptions of performance criteria and relevant cues within the Excel user interface. We tested these two designs on eight participants each, and found that those using the supplemented tutorial created the graph with significantly fewer errors than those using the stepwise tutorial.
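A parallel test of this size amounts to a small between-groups comparison. With such small samples, an exact permutation test on the group means is a simple, assumption-light way to evaluate the difference. The error counts below are invented for illustration; they are not the study's actual data.

```python
# Hypothetical between-groups analysis for a small-N parallel test.
# The error counts are made up for illustration, not the study's data.
from itertools import combinations
from statistics import mean

stepwise     = [9, 7, 11, 8, 10, 9, 12, 8]   # errors with tutorial A
supplemented = [4, 6, 3, 5, 4, 7, 5, 4]      # errors with tutorial B

observed = mean(stepwise) - mean(supplemented)

# Exact permutation test: how often does a random split of the pooled
# scores produce a mean difference at least as large as the observed one?
pooled = stepwise + supplemented
n = len(stepwise)
total = count = 0
for idx in combinations(range(len(pooled)), n):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    if mean(group_a) - mean(group_b) >= observed:
        count += 1

p_value = count / total
print(f"mean difference = {observed:.2f} errors, p = {p_value:.4f}")
```

With eight participants per group there are only 12,870 possible splits, so the test is exact and runs in well under a second; no distributional assumptions are needed.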

For the next stage we adopted the linear-process model of iterative development and aimed to test at least one revision per week. We tested each revision of the task analysis on four to six participants, and reviewed the graphs they constructed immediately after completion. Researchers identified participant errors and analyzed the corresponding sections of the task analysis. Task analysis revisions focused on the most frequent participant errors and those that were particularly detrimental to overall performance, and the nature of each revision was documented. After four weeks, the task analysis had been revised and tested six times, including changes to nearly every section, but primarily focusing on the phase change lines and axes. Examples of revision types include the order and rate of instruction (i.e., the number of steps per section), providing descriptions of instructional goals, the structure of step descriptions (e.g., using numbered lists), and aesthetic tutorial interface changes (e.g., font format and size, color and location of navigation buttons).
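The triage step above—tally observed errors by task-analysis section and revise where they cluster—is easy to formalize. A minimal sketch, with invented participant IDs and section names:

```python
# Sketch of revision triage: tally observed errors by task-analysis
# section and surface the most frequent ones first. Participant IDs,
# section names, and counts are invented for illustration.
from collections import Counter

# One record per error observed in a test round: (participant, section)
observed_errors = [
    ("P1", "phase change lines"), ("P1", "axes"),
    ("P2", "phase change lines"), ("P3", "axes"),
    ("P3", "phase change lines"), ("P4", "data series"),
]

error_counts = Counter(section for _, section in observed_errors)

# Revise the sections producing the most errors first.
for section, count in error_counts.most_common():
    print(f"{section}: {count} error(s)")
```

Keeping one such tally per iteration also yields the revision documentation for free: comparing tallies across versions shows which error types disappeared after the corresponding section was revised.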

Many types of errors committed in the initial rounds of testing never recurred following revisions to the corresponding sections. This provides anecdotal evidence of the effectiveness of those revisions. We say “anecdotal” because there was no clear experimental analysis. Conceptually, the linear-process model of iterative design is an ABCDEF design: each of the six revisions represents a unique phase in which a different independent variable was introduced, but without a treatment reversal, secondary baseline, or control group to evaluate experimental control. Therefore, one limitation of iterative development is a lack of experimental control. Observed performance improvements following version changes may be due either to the revisions or to extraneous variables, such as repeated testing or individual differences among participants such as preexisting skills or learning history. It would therefore be erroneous to make similar revisions to other products based on such isolated and unverified observations, at least without subsequent testing.

A major benefit of iterative development that we observed during this investigation is the identification of novel hypothetical independent variables for potential revision. Each iteration, testing, and analysis phase indicated numerous potential revisions we could make to further enhance learner performance. When we began revising this task analysis, we had identified two independent variables (performance criteria and relevant cues) that could potentially improve learner graphing performance. After six iterations, we had identified eight variables within task analysis instruction that could each lead to its own experimental analysis, and many of these variables have received little to no attention in the behavior analytic literature. Therefore, a linear process of instructional material development may have great value for brainstorming new variables for manipulation.

Behavior Analysts in User Research

Jobs in UR tend to be filled by employees with very different skill sets than many behavior analysts possess, such as computer programming and using web analytics and image editing software. Doctoral education has been criticized5 for the narrow scope of student training and lack of practical skill development, so organizations may tend to prefer industry experience over PhD graduates. Nonacademic applicants for tech jobs may also be better able to demonstrate their relevant skills via portfolio than PhD candidates can demonstrate their mastery of research methodology and theory. This is not to say that an understanding of topics such as stimulus control or research design isn’t valuable for tech developers, but that applicants proficient at programming or using Photoshop™ can easily point to programs, websites, and images they’ve created, while PhD graduates have transcripts, resumes, and published research articles that application committees might undervalue. Graduate students interested in careers in technology face the challenge of acquiring relevant skills independently, such as by taking supplemental technical courses concurrently with their required graduate studies. Given some flexibility in graduate requirements and research area, however, it may be possible to intersect academic research with some of the tools and methods relevant in industry. Intersecting industry-relevant and behavior analytic methods may enable educators to deliver an accredited education alongside more readily generalizable and marketable technical skills.

Mastering iterative testing alone may not affect one’s success at transitioning into industry; however, most industries use an array of methods, ranging from simple to complex, subjective to objective, qualitative to quantitative. Graduate students intending to enter a specific industry can identify relevant skill sets and methodologies for investigation by reviewing job postings and subscribing to the online publications of industry leaders such as NN/g. Our behavioral training and research experience enable us to evaluate their methodologies and make informed decisions regarding their validity, effectiveness, and efficiency. We also possess the skills to adapt and apply these methods as we would apply any other method we encounter in textbooks and journal articles. Such interdisciplinary practice not only benefits the student by providing specific industry-relevant research experience, but may also benefit behavior analysis by trialing new methods of data collection and analysis.


In order of citation:

  1. Nielsen, J. (2012). Usability 101: Introduction to Usability. Accessed October 2014 at:
  2. Nielsen, J. (1993). Iterative User Interface Design. Accessed October 2014 at:
  3. Nielsen, J. (2011). Parallel & Iterative Design + Competitive Testing = High Usability. Accessed October 2014 at:
  4. Nielsen, J. (2000). Why You Only Need to Test with 5 Users. Accessed October 2014 at:
  5. Nerad, M. (2004). The PhD in the US: criticisms, facts, and remedies. Higher Education Policy, 17, 183-189. Accessed October 2014 at:

Bryan C. Tyner received his Bachelor’s in psychology from the University of Nevada, Reno and his Master’s in psychology from Queens College. He is currently completing his Ph.D. in behavior analysis at the CUNY Graduate Center. He researches empirical instruction design, and his professional interests include user research, information architecture, and data visualization.

Daniel M. Fienup received his Master’s in behavior analysis from Southern Illinois University and Ph.D. in school psychology from Illinois State University. He is a board certified behavior analyst and licensed behavior analyst in New York state. His research evaluates the effectiveness and efficiency of behavior analytic instruction.




