
UX Benchmarking to Demonstrate ROI with Kate Moran, UX Specialist at Nielsen Norman Group

In our first ever live podcast episode, we chat with Kate Moran about how to demonstrate the ROI of your UX work

Carrie Boyd

UX benchmarking may seem like a lot of work, but Kate Moran is here to show you how to do it effectively. She's a UX Specialist at Nielsen Norman Group who leads UX teams to better benchmarking, teaches newbies how to get started, and explains this complicated subject with clarity. She joined Erin and JH on our very first live episode to explain how UX benchmarking can help teams show the ROI of their work.

She walked through how benchmarking can help get stakeholders on board, how to choose the right metrics early on, and most importantly, how to translate that to real ROI.

Our very first live podcast was a great learning experience and a ton of fun! We really enjoyed the interactive aspect, and our audience asked a lot of thoughtful questions.


[2:01] Kate explains what UX benchmarking is

[3:37] How to choose benchmarking metrics

[12:01] The difference between summative and formative studies, and why you need to distinguish between them.

[17:21] Why context matters when evaluating benchmarking metrics

[21:28] How to translate benchmarking results to ROI

[29:11] Kate talks about case studies from NN/g's ROI for Usability report

[30:51] Creative ways to benchmark

[35:34] Q&A - How do you limit bias in unmoderated studies with non-users and users?

[38:16] Q&A - How do you measure time spent on a task? Stopwatches aren't great.

[39:59] Q&A - How do session replay tools fit into this?

[41:10] Q&A - What happens when your stakeholders have different metrics for success?

[44:57] Q&A - If a participant thinks they completed a task successfully, is that a success?

[46:42] Q&A - How do you benchmark for emotional aspects, like how fun a product is?

[49:07] Parting words of wisdom



Kate's recommended resources

Case study submission
Want to be featured in an NN/g report? We're collecting case studies about UX benchmarking. In other words, we want to hear about a specific time when your team implemented some kind of design change and collected quantitative data before and after that change.

We're offering free reports and full-day training courses as thank-you gifts. Reach out to us at, or check out the submission form at



Erin: [00:00:35] Hello everybody, and welcome back to Awkward Silences. We're here today with cohost JH Forrester, as well as Kate Moran. She is a senior UX specialist at Nielsen Norman Group. Today we're going to talk about UX benchmarking and the ROI of UX. We know that's something that is really on the minds of a lot of people these days, when you think about just having to be really smart about where you spend your dollars and your time. So we are so, so happy to have you here today, Kate.

Kate: [00:01:08] Thanks for having me here Erin and JH. It's nice to meet you guys virtually.

Erin: [00:01:13] Yes. The only way these days.

JH: [00:01:15] Yeah, I'm excited. I feel like I say this most episodes, but a good part of doing this podcast is that I get to learn about topics that I don't know a ton about, and UX benchmarking definitely falls into that category for me. So I'm excited to learn from you today.

Kate: [00:01:27] Cool. Yeah. I think that's pretty common, even for very experienced UX researchers. A lot of them really focus on qualitative, so this kind of quantitative side is sort of foreign to them. So that's something that I'm hoping this will do: kind of get the word out about the benefits of quantitative data for us.

Erin: [00:01:47] Awesome. So let's just jump right in. What is UX benchmarking and why do teams need it?

Kate: [00:01:54] Yeah. So there's a little bit of debate on the exact definition of benchmarking, but to me, benchmarking at its core is using quantitative, numerical data. Usually I refer to that as a UX metric. So you use that quantitative data to assess the user experience of a product or service. And the way that you assess it is by comparing that number, or usually a set of numbers, against some meaningful reference point, some kind of standard.

So you might compare your numbers from, you know, this year's version of the product to the numbers from last year's version of it. Or you might compare your numbers to competitors' numbers. Or you might compare your numbers to the average across an industry.

So if you have an eCommerce site, you might look at the average cart abandonment rate for your specific type of eCommerce site, for example. So that part of it is really key. If you're collecting those numbers and you don't have anything to compare them against, then how are you going to know if that's good or bad, or how are you going to make sense of it?

JH: [00:03:04] Hmm. And how do you, when you get that number to compare against, how do you know that it's a fair comparison, right? So like, I feel like if I were to bring up, hey, our cart abandonment rate is 30% and the industry average is 20%, I'm just making up numbers, and I shared that with somebody on the team, I feel like the first thing I would get would be like, well, what we do is very different. Our prices are much higher. Like, you can't compare. So how do you find a comparison that people take stock in?

Kate: [00:03:30] Yeah, that's a great point. So if you're thinking about something like an industry standard, which confusingly is sometimes referred to as an industry benchmark, so if you're comparing against like an industry average, that is one of the objections you might make is like, you know what, we're a little bit of a special case, so it's not really fair to compare.

So maybe then you don't want to compare against that average of the industry, but you want to compare against a specific competitor who is specifically in your same category. And so then you do feel like it's a fair comparison. Or you want to compare it to an earlier version of your product. But even then, we always have to be asking that question, like, is this a reasonable thing to compare against, especially with coronavirus?

So a major method that a lot of teams rely on for benchmarking is analytics data. So now we're in a situation where, okay, your entire customer base, your entire user base is probably going to have very different behavior patterns than they did before this hit. So it's going to be hard for you to isolate those variables and really see that the change that's happening is due to your design.

And in some cases we kind of have to just live with that imperfectness. Like, you would love to have a perfectly controlled experiment where the only thing that's changing is your design, but especially with something like analytics, that's not always possible.

Erin: [00:04:54] Yeah. You mentioned kind of, you know, you want to pick some numbers and look at them and see how they're changing. We did a recent episode about big data and how much data there is, you know, too much data arguably, out there. So if you're a UX team or UX person or a person doing UX design, where do you start when choosing which metrics to focus your benchmarking on?

Kate: [00:05:20] Yeah, that's a huge question. And that's something that a lot of teams are trying to work through, especially when they get started with a benchmarking practice. So say you haven't been benchmarking your product or your service until now and you're saying, okay, I want to start my practice now. And I kind of refer to it like a practice because it is something that's ongoing over time, ideally.

The tricky part when you're just getting started is you have to try to pick metrics that are going to be meaningful years from now. Like, you wouldn't want to pick a bunch of metrics that are related to some flashy new feature that you just added that everybody's excited about, but that doesn't have anything to do with the purpose of the product.

That is kind of the trick: you have to ask, what is this product's reason for being? Like, why does it exist? And then to choose metrics, you really have to think about your goals. Like, are we going to be doing this benchmarking so that the design team can see our improvement over time? If that's the case, we really want to think about what I call the UX metrics.

We want to think about the metrics that tell us the most important things about what's happening. And this is something a lot of times people struggle with, because this idea of putting a number on the user experience seems counterintuitive, because we talk about it being so, like, personal and kind of nebulous.

And so the way that I like to think about it is you're not really putting a number on the entire experience. You're putting a number on some aspect of it. So that could be something like, you know, how long on average does it take people to do this really important critical task? That's kind of at the core of what we do.

It could be something like, you know, how likely are people to recommend our product? It could be something like an NPS, the Net Promoter Score. You really have to look at what are the core values of this product or service. Like, why does it exist? And then also, what are the business goals?

So you also want to have at least a couple of metrics that can kind of track with those business goals, because that's really going to help you calculate return on investment. So we kind of want to balance what are the things that are interesting to the design team, and then what are also the things that are interesting to the business.

JH: [00:07:31] Is it something when a team's getting into this that they'd be better off picking one kind of benchmark metric or UX metric, starting there, and then adding more to that as they cover more? Or is it better to kind of cast a little bit of a wider net right at the start, so that as you start to feel which ones are really relevant or useful to the organization, you kind of have those and now you have the history, if that makes sense?

Kate: [00:07:51] Yeah. It kind of depends a lot on what method you're going to use to collect the data. And so I mentioned that analytics is a big one that a lot of people really rely on because it tends to be a lot cheaper, a lot easier. So if it's something like analytics, it probably is kind of easy for you to go in there and pull out metrics as you think of them, as you decide that they're going to be meaningful.

But another methodology is something like quantitative usability testing, which is distinct from qualitative, and sometimes people don't understand that. If you are running a study for the purpose of collecting benchmarks, I would call that a quantitative usability study.

And you need a lot of participants. So a lot of times we advise people, if you're doing qualitative, you can get away with anywhere from five to 10 participants, potentially. Because in a qualitative study, we want to find out what's wrong with this thing, what's not working, and we want to get ideas for how to fix it.

With a quantitative study, we're trying to get a random sample that's going to represent an entire group of people. So we want to have some amount of accuracy there. So for those kinds of studies, we usually recommend having about 40 participants. And so that represents a lot of cost and a lot of effort on the part of the researchers.

That's why remote unmoderated testing is really popular for that kind of study. But so anyway, if you were, if you were planning a quantitative usability test, then you do kind of need to sit and think about like, what are the metrics? What are the most important tasks we should be running? What are the metrics that we should be collecting for this?

Because you want to create that baseline dataset, so that when you have a new iteration of this product or this service and you run your next round of the study, you can cleanly compare those metrics. So a lot of times what I'll recommend doing is getting together with your stakeholders for the project, maybe your clients, and talking about what are the different metrics that we could use to represent success.

And a lot of times I find that that kind of question has not formally been asked, especially not on the UX team. Like, maybe someone is thinking about that for the entire business, but the people responsible for design haven't had that conversation. So it's about making sure that there's some sort of alignment or agreement across the people who have the most stake in the product about what the goals should be and how they should be measured.
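
To make the sample-size point concrete, here is a minimal sketch of how the uncertainty around a task success rate shrinks as you move from a small qualitative-sized sample toward the roughly 40 participants mentioned above. It uses the adjusted Wald (Agresti-Coull) interval, which is one common choice for proportions and is our assumption, not a method named in the episode. The participant counts and success counts are made up for illustration.

```python
import math

def adjusted_wald_interval(successes, n, z=1.96):
    """Approximate 95% interval for a success rate (Agresti-Coull method)."""
    n_adj = n + z ** 2                         # add z^2 pseudo-observations
    p_adj = (successes + z ** 2 / 2) / n_adj   # pull the rate toward 50%
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# Hypothetical studies: the same 80% observed success rate at two sample sizes.
for n, successes in [(5, 4), (40, 32)]:
    low, high = adjusted_wald_interval(successes, n)
    print(f"n={n}: observed {successes / n:.0%}, interval {low:.0%} to {high:.0%}")
```

With the same observed rate, the interval for 40 participants comes out much narrower than the one for 5, which is the "some amount of accuracy" a benchmark needs before two rounds can be compared.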

Erin: [00:10:19] Absolutely. And it's probably a good thing to start thinking about as you're launching new products that might introduce really new goals, new user goals, new business goals, and hence, you know, new things you'd want to benchmark against to kind of get that baseline early, start tracking that data early so that you can then, you know, look at that over the lifetime of the product.

Kate: [00:10:41] Yeah, that's a huge aspect to this: it requires a lot of long-term planning and forethought. It really stinks, and I hear this a lot from the agency teams that I help. I hear them say, oh, we got to the end of this project, and then we realized we should have measured XYZ with the original version and we didn't.

And those numbers are no longer accessible, so, so we just don't have that data. But you know, maybe we should back up a little bit and talk more about like these benefits of, of benchmarking. Cause I think a lot of people see the difficulty in doing something like a quantitative usability study and they say why would I do that?

And that's a valid question, because qualitative research is so useful in UX. That's why we rely on it so heavily, because it answers a lot of the kinds of questions that we have. Like I said, it tells us, you know, what isn't working. It helps us understand how our target users think and behave and how they speak, like what language they use.

So all those little pieces are so useful. And that's why qualitative tends to make up, I find, the biggest part of the UX research toolkit. So then why would you want to do a quantitative study? To kind of answer that question, I want to draw a distinction between summative evaluations and formative evaluations.

And I realize this is going to sound kind of pedantic, but the difference is actually important. So with a, like a... go ahead, do you have a question, JH?

JH: [00:12:05] Oh, no, I'm just saying. Yeah. Let's,

Erin: [00:12:07] Let's get academic here for a second. Sounds


Kate: [00:12:09] Get academic. Okay. So a formative study is the type of research that we do all the time in UX. A formative evaluation is focused on trying to understand how this thing should take shape.

So that's an easy way to remember that one: formative, form. How should we form whatever this thing is we're designing? So if I run a five-person qualitative study to help me understand what the issues are, if I run, you know, a set of qualitative interviews, if I even do an A/B test, which is quantitative, but in that case I'm looking to get information about which of these different design alternatives performs better.

All of those are examples of a formative study. I'm kind of in progress building this thing, and I want to check in with my users and bring some research into it to help this thing take shape. So that's formative. Summative is different, and we can think about it like, you know, if you think back to grade school, it's like a final exam.

So at that point you've already taken the course, right? You can't go back and change your learning. It's already taken place. And so what we want to do with a final exam is sort of assess how that went. And it's a similar concept in design. You know, our designs are never finished. They're always iteratively changing, but you do reach end points in the design cycle.

So as you reach that end point, you look back and say, okay, compared to where we were at the start of this cycle, how much have we improved? And so that is a summative study. So it's not so much to get information about how to change the thing, but it's more, you can think about it like a snapshot in time.

Like, how are we doing right now? And how were we doing back then? So that's one of the big benefits. Benchmarking is a different kind of evaluation, and it gives you kind of a clean, concrete way to track over time to see that you're improving.

Erin: [00:14:09] But I imagine that might lead to you wanting to make some changes, right? If we say, oh, this huge metric we're tracking is taking, you know, a nosedive, that might be, let's get a task force on that right away. Right?

Kate: [00:14:21] Yeah, definitely. So the other thing, and I know this might kind of confuse things, but yeah. I always find that if you're running a quantitative usability study, let's say, just because your priority is to get the metrics doesn't mean you won't also get insights.

It doesn't mean you won't also, you know, find out why something isn't working, just by observing people. So you definitely can use it that way. Really, the big difference is that you're going to set up that study in a different way than you would set up a formative study.

JH: [00:14:55] It helps me to have an example. So when I think of the benchmarking metrics we talked about, like the ones that you'd get through the analytics tools you mentioned, which are kind of cheap and easy, you know, how long did this take, or what's the abandonment rate or conversion rate from something. What's a type of benchmark metric that you'd be better off getting through a quantitative user research study?


Kate: [00:15:13] That's a great question. So with quantitative usability testing, there's really three things. There's more than this, but there are three that I find are most useful in most situations. The first is time on task. So that's a big one that you sort of can get from analytics. Like, you can look in your analytics tool and find things like, you know, the average time per session or the average time on page.

But that's a really different evaluation of that thing, because we don't know what else people were doing at that time. With a quantitative study, it's like I'm creating an experimental condition where I'm telling you to try to do this task. Now, it should be a realistic task you would really do.

So that's going to be kind of a cleaner measurement. Like, you're getting rid of a lot of the noise that you would find from your analytics tool. So that's one big reason you might run.

Erin: [00:15:59] Yeah. I was just gonna say, is the idea with time to complete a task that less is always better, or, you know...

Kate: [00:16:07] Mostly, yes. So yeah, and that's something that, again, is a little bit different in marketing's interpretation sometimes of analytics, like traditional marketing's interpretation of it, where we think about, okay, we want people to engage more. We want this product to be sticky. We want them to spend a lot of time on the page.

That might be true if you have an entertainment product. Like if you, you know, if you're designing a mobile video game, you know, definitely, then you want to see more time. But for most things, people just want to get their task done and move on with their lives. You know, spending, spending more time on a task is usually not a good thing.

JH: [00:16:46] Yeah. Actually, an interesting example: I was talking to a friend who works at Spotify the other day, and he was saying they have a real internal debate around skips, about whether skips are good or bad. Because if it's a discovery playlist, skips are a sign of engagement, because the person's listening in and, like, telling you, I don't like this one. Whereas if it's, you know, a playlist around artists that they're familiar with, skips are bad because you're not playing them stuff that they like. And it's like, depending on the context, whether that is good or bad, you know, varies so much, and it kind of relates to this "faster is better," right?


Kate: [00:17:14] Yeah, definitely. It all has to be grounded in the context. I think that's something that we keep coming back to, and it's really important: how you pick the metrics, how you interpret the metrics, how you make sense of whether or not this is good or bad. So time on task is one that you would want to get from quantitative usability testing.

Also success rates. So whether or not people can successfully complete a task. And that's another one where you might say, can I get that from analytics? So you can get completion rate from analytics. But then again, you don't know, if somebody drops out of that funnel, if they drop out of that process, was it because you have really poorly designed this onboarding flow and it doesn't make sense to people, or was it because something else happened?

You know, just like random noise, like random life things could happen to interrupt them. So again, you get cleaner, more focused ways of measuring this. And then the third thing you typically get from quantitative usability testing is a satisfaction score of some sort. So you can do things like, you can ask people how easy or difficult was this task to complete?

Or, how satisfied are you overall? Which, again, is something you could get with an online survey. But then with an online survey, you may not know how much exposure to the product they've had at the time that you asked the question. With the quantitative usability test, I know they just tried to complete the onboarding process for this platform.

And so I know what the context was before they responded to that question. So that is the benefit of quantitative usability testing: it is a little more focused that way.
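
As a rough illustration of the three metrics described above (time on task, success rate, and a satisfaction score), the sketch below summarizes one round of a quantitative usability study. The session data, the 1-to-7 ease scale, and the choice to report median time for successful attempts only are all assumptions made for the example, not details from the episode.

```python
from statistics import mean, median

# Hypothetical sessions: (completed_task, seconds_on_task, ease_rating_1_to_7)
sessions = [
    (True, 74, 6),
    (True, 91, 5),
    (False, 180, 2),
    (True, 66, 7),
    (True, 120, 4),
]

def summarize(sessions):
    """Summarize one benchmarking round into the three core metrics."""
    completed = [s for s in sessions if s[0]]
    return {
        # Share of participants who finished the task
        "success_rate": len(completed) / len(sessions),
        # Assumption: time on task is reported for successful attempts only
        "median_time_on_task": median(t for _, t, _ in completed),
        # Post-task ease rating averaged over everyone, including failures
        "mean_ease_rating": mean(r for _, _, r in sessions),
    }

print(summarize(sessions))
```

Running the same summary on each round of the study, with the same tasks and the same definitions, is what keeps the baseline and the follow-up comparable.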

JH: [00:19:25] Awesome. That makes sense. The thing that reminds me of, the way it helps you control for the environment, is, you know, college psychology departments, when they have people running various studies and have students come in. And there's problems with that, right? Because it's a very biased audience. But they are doing something kind of similar.

You're trying to just limit distractions and say, let's try to quantify whatever this outcome is. And that kind of is what's coming to mind for me. We do have a question here from the audience. STAM writes in: if we have all these snapshots, and one of the previous snapshots, you know, performed better, what do you do in that situation?

Do you just try to roll back to the previous version or, you know, how do you locate what is actually driving that?


Kate: [00:20:03] So that's a great question. Stamps. I think that kind of touches on a theme that is really important, which is that quantitative data can never replace qualitative data. We talked about all the reasons why qualitative data tends to be the main tool a lot of UX researchers use. And that's for good reason.

Like, if you want to know why our new version is performing worse than our previous version, which, you know, happens sometimes. We expect people will respond to something in a certain way, and then they just don't. So the best way to figure that out is to use your qualitative research tools, not necessarily to roll back the entire thing.

That's especially true, like, you can have a different cadence for how you run these summative benchmarking studies. And it really just depends on, you know, your project cadence and what you're using benchmarking for. But you could do it after, you know, every major iteration of the product.

You could do it every year. You could do it every other year. So if you are in a situation where, okay, between 2019 and 2020, we changed all these different things, all of these different details in the design, you're probably not going to be able to pinpoint just from that quantitative data what the key problem was.

So you're, you're always going to have to turn to that qualitative research there.

Erin: [00:21:18] I know. I know. One of the things we wanted to talk about was, you know, UX benchmarking in the context of ROI, right? And when we, when we talk about ROI, we're talking about money, we're talking about numbers. We're talking about quantitatively minded people who, you know, care about these sorts of things.

And so of course, quantitative data is going to be helpful for talking about ROI. As we talk about, you know, how benchmarking can be seen as, and can be in reality, quite time intensive, resource intensive, capital intensive. You know, you're talking about, well, you could do it once a year. You could do it, you know, several times a year as you make changes to the product.

How do you think about the ROI of the benchmarking itself, of when is it worth the time and the effort to do it, versus to maybe, you know, do something else instead?

Kate: [00:22:08] Yeah. So that's a great question, and it's definitely true that you should view this as a tool, and you should use it if you have a need for it and not use it if you don't. So maybe we should back up just a minute and talk about why benchmarking is good for calculating return on investment. The nice thing about benchmarking is that at the end of these design changes, I get these specific concrete improvements.

So I can say we reduced calls to our customer support center for this specific task by, you know, 95%. We cut our time on task for this critical task in half. Like, you can give those specific numerical changes or relative changes. And that's something that tends to resonate really well with people in leadership positions.

You know, we could speculate about why that is, but I think a lot of it has to do with, you know, people who end up in those positions often come from business school. And so that's something that's kind of ingrained. It's sort of part of that philosophy. And so being able to numerically say, this is what you're getting as a result of the design changes we did.

That is something that's inherent in any kind of benchmarking. So that's one reason you might want to do it, but then you can take that a step further. So let's say that I redesign an intranet and I reduce time on task for this really critical, frequent task that I know my employees are constantly doing.

So I reduced that by some amount. My employees are being paid hourly. I know how many of them I have, so it's pretty simple math. I can do the math and figure out what amount of cost savings this translates to for the intranet, for the company, this year and sort of in an ongoing way.

And then I can take that even a step further and say, so let's compare that to the costs of that redesign project as well. So that's something, Erin, you could definitely include in there. You could say, so this was the cost of the, of the, the discovery phase that we did. These are the, the development and design costs involved with this redesign.

And also, here's the cost from the quantitative study we had to do to even get the numbers that we're talking about right now. So you could include all of those things and say, you know, here's what you're getting. And that is something that's really powerful and tends to resonate with leadership.

So that's great for getting buy-in. And that's something that almost every UX professional that I talk to wants more of. Almost everybody that comes to the conferences that NN/g does, or anybody that I interview, or all the teams that I help, they all say, we wish we had a little more, you know, street cred in our organization.

We wish we had a little more support. And that all tends to come down to, are you actually showing the business side that you're delivering, that you're actually giving value? McKinsey just did a survey, I think it was last year, and they surveyed a lot of design teams and they found that more than 50% of those design teams that they talked to had no way of doing this.

They had no way of setting a specific numerical target and then showing that they met that target. That's not good for the design team, but it's really not good for the relationship between design and business, because if you're in leadership, why would you want to keep throwing money at something you're not totally sure is actually having the impact you want?
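
The intranet example Kate walks through above is, as she says, pretty simple math. Here is a minimal sketch of it. Every figure below (seconds saved, task frequency, headcount, wage, project costs, working days per year) is a made-up placeholder, and the function names are ours; swap in your own measured numbers.

```python
def annual_savings(seconds_saved_per_task, tasks_per_employee_per_day,
                   employees, hourly_wage, workdays_per_year=250):
    """Yearly labor cost saved by a reduction in time on task."""
    hours_saved = (seconds_saved_per_task * tasks_per_employee_per_day
                   * employees * workdays_per_year) / 3600
    return hours_saved * hourly_wage

def roi(savings, costs):
    """Return on investment as a ratio: net gain divided by cost."""
    return (savings - costs) / costs

# Hypothetical benchmark result: the redesign saves 30 seconds on a task
# that 2,000 hourly employees each perform 10 times a day.
savings = annual_savings(30, 10, 2000, hourly_wage=30.0)

# Hypothetical project costs, including the benchmarking study itself.
costs = 60_000 + 300_000 + 40_000  # discovery + design/development + quant study

roi_value = roi(savings, costs)
print(f"First-year savings: ${savings:,.0f}, ROI: {roi_value:.0%}")
```

Counting the cost of the quantitative study alongside discovery, design, and development keeps the comparison honest, since those numbers would not exist without it.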

Erin: [00:25:39] And it's not that you need every designer or creative thinking about dollars and cents all the time, right? It's that you need to kind of, in a way, free them to think about what they're good at by showing that that work has value.

Kate: [00:25:54] Exactly. Yeah. So this is not something that everyone needs to worry about. I think, kind of getting back to your earlier question, like how do you decide whether it's worth your time, you know, you can choose a methodology like analytics or even online surveys. You can also get these kinds of metrics without doing any user research. Like, go talk to your customer support people, or look at your performance metrics.

You can find ways to do this cheaply. But yeah, this may not be something that every UX researcher, every UX professional needs to think about. But I do think that if you're a UX lead, if you are, you know, leading a research team, then this is something you should consider.

Particularly if you're in that situation I just described, where, you know, a lot of people feel like, we don't really get credit for the work that we do. We don't really get the resources that we need to do UX the way that we want to do it, the way that we think it should be done. If that's your situation, then maybe this is something you should think about.

But if you're already working in a high UX maturity organization, where UX is baked in, design is baked in from the top down, then, yeah, this may not be worth your time. It's just a tool. You've got to use it if you need it.

JH: [00:27:09] And is that something where if you're going to do that, like pre-post, you know, show the value, like in a perfect world, you would kind of take a snapshot or do a benchmark right before a large change and then kind of right afterwards so you can kind of try to isolate some of that. Is that like the gold standard in terms of how to try to get that difference?

Kate: [00:27:25] It is, but it depends. Exactly how you do that depends a lot on the methods. So like we talked about with analytics, how coronavirus is messing up everybody's analytics data. But it's messing up everything, including analytics data. One of the issues you have to think about is, there are all these natural fluctuations in your user traffic.

Like, everybody has these. Nielsen Norman Group, for example: we publish articles that come out in our newsletter on Monday. So we have a big spike in traffic on Monday, and then that tapers off. Like, we don't get a ton of people reading UX articles on Saturdays and Sundays, which is fine. But then, you know, if you're a university website, like, I've helped university websites where they find that their traffic is very different, like different behavior patterns, different users during the summer versus the fall.

So you've got to know that sort of fluctuation. You've got to know those patterns and think about that. Like maybe, to your question, JH, maybe it's not the right timing to do it exactly before and after. Like maybe, if you have those seasonal changes, you've got to collect that metric in May of 2019 and then in May of 2020, so it's really a fair comparison.
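A minimal sketch of that kind of same-season comparison, assuming hypothetical conversion rates. The numbers and the `percent_change` helper are illustrative only and do not come from the episode:

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before * 100.0


# Hypothetical monthly conversion rates (percent), keyed by (year, month).
conversion_rate = {
    (2019, 5): 2.0,  # May 2019, before the redesign
    (2020, 5): 2.5,  # May 2020, after the redesign
}

# Compare the same month a year apart so seasonal swings affect both sides.
same_season = percent_change(conversion_rate[(2019, 5)], conversion_rate[(2020, 5)])
print(f"May 2019 vs May 2020: {same_season:+.1f}%")
```

Comparing May to May does not remove every confound (a year is a long time), but it keeps the seasonal pattern roughly constant across the two measurements.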

JH: [00:28:36] That makes sense. Yeah. There's, there's a lot to account for on this, it seems like, huh? 

Erin: [00:28:41] Funny, not really the point of what we were saying, but, you know, we've noticed we're actually getting more traffic to our content on the weekends because of coronavirus. So,


Kate: [00:28:50] Yeah, maybe. Maybe that's 

Erin: [00:28:51] Yeah. So it's been interesting. So I know that you're working on a report that you've put out at Nielsen Norman Group a few times, and you've got it tentatively coming out again in the fall: this really robust report of case studies of how folks are using UX benchmarking to show the ROI of their work. I'm curious if you have any sort of greatest hits or case studies that you'd like to share for the folks listening, where teams have really used these techniques to show great impact.

Kate: [00:29:23] Yeah. So for a little bit of context there: it's a report that is currently called ROI for Usability, but I think we'll probably be changing that title when it comes out in the fall. And this will actually be our fifth edition. We're sort of constantly updating it, but this is a really big update. And essentially it's a collection of all of these different examples of companies doing this work in real life. So actually benchmarking, actually calculating ROI for their design work. And so some of those are like positive outcomes, like, you know, everything happened the way we wanted it to happen, and our metrics look great. Yay. And some of them are like, we thought this was going to go a different way. It didn't actually turn out how we wanted, but here's what we learned from it.

So that's what the report is, sort of a collection of those case studies. And then we look across those case studies to see, are there any themes? Like if you're redesigning in this specific context, trying to improve your conversion rate, for example, what are realistic percentages that you could expect to find? So that's kind of the point of the report. And so we've just started collecting those case studies. And for everybody listening, I am still collecting those. So if you've done this before, I'd love to hear about that.

I recently got actually a couple different case studies from Marketade, and I actually found them through your podcast. So I listened to the episode that you guys did with Sonya, and, you know, I hadn't heard of them before, and ended up checking them out. And I was really impressed because they seem like a very rare UX team that actually does a lot of quantitative work, and they're performing the methodologies correctly, which is also unfortunately rare to find. So, kind of as an aside, that is one thing: if you're listening to this and you're like, this is something I want to start doing, I want to do more benchmarking, a good place to start would be making sure that you have grounding in quantitative methodologies and also at least some basic statistical concepts.

But, so I was talking to Marketade and they told me about a couple of different case studies, but one of them was really interesting. They were working on this B2B website for this company that does, like, metalworking and woodworking machinery.

So like a very specific, sort of esoteric topic. And they did a lot of user research with them, but this one thing they did with them I think was really interesting. They went through some number, I want to say it's like a hundred, of their most frequently searched queries on the site search for this B2B website.

And so then they worked with a subject matter expert and had them look at the results. So they just basically went through each of these common queries on the site search and they just quantified, like, how many times, when these searches are happening, do you get completely irrelevant results? So that was something that's not coming from analytics. It's not coming from quantitative usability testing, but it is a metric. So they ended up finding, before they worked with them to redesign how the site search worked, it was something like 50% of searches resulted in a relevant result, which is pretty bad. So I think that's a great example of how, like, we talk about this as time on task, conversion rate, all those things.

Those are like the basic metrics that you might use, but you can apply this approach to almost anything. Like if you think outside of the box, you can really get creative.

Another case study that I heard about wasn't actually about a product. So this was from a UX lead who wanted to convince the leadership in his agency that they needed to build a design system, which is very much a hot topic right now. And his leadership was kind of on the fence because, you know, as anybody who's ever built a design system knows, it's a huge amount of effort. Like, it requires a lot of work. So they were kind of like, well, I don't know, is it really going to save us enough time?

So what this UX lead did was he looked at the amount of design and developer hours required to build a specific element. I think it was like a video plugin that they did for a recent client. So he looked at how much time it took them to do just this one element for this one client. And then he looked back through recent projects and counted the number of times that they had built different video plugin elements for different clients, which is essentially redundant work that doesn't need to happen every time. So he basically added all of that up and he was like, so this is an estimate of how much time we've wasted for one element. And so he used that as his argument to convince leadership.

And there's a couple of things I like about that example. So one thing I like about that is that, again, he's thinking outside of the obvious metrics that you could look at, and it probably didn't take him very much time to get that estimate. And he could've gone further, right? He could have also calculated it for, you know, form fields that they had redesigned and, I don't know, global navigation elements. So he could have looked at other elements, and that would have been a lot more time and work for him. But he realized he didn't need to do it that way. He could sort of stop with the amount of time wasted just for this one element, and that would be enough to make his point.

So I think, you know, talking about, like, is this worth someone's time? I think that's a really big thing to consider: come back to what point you're trying to make. Like, who are you trying to convince and what are you trying to convince them of? And then you should do exactly as much work as is necessary in order to make that point.
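The design system estimate Kate describes boils down to simple arithmetic. A rough sketch follows, in which every number is a hypothetical placeholder rather than a figure from the case study. Counting only the builds after the first one as redundant is also an assumption made here, not something the episode specifies:

```python
# All values below are hypothetical placeholders, not data from the case study.
HOURS_PER_BUILD = 40.0       # design + developer hours to build one video plugin
TIMES_BUILT = 6              # recent projects that rebuilt a similar element
BLENDED_HOURLY_RATE = 100.0  # assumed blended cost per hour, in dollars


def redundant_hours(hours_per_build: float, times_built: int) -> float:
    """Hours spent rebuilding an element after the first build (assumed reusable)."""
    return hours_per_build * max(times_built - 1, 0)


wasted_hours = redundant_hours(HOURS_PER_BUILD, TIMES_BUILT)
wasted_cost = wasted_hours * BLENDED_HOURLY_RATE
print(f"Estimated redundant effort: {wasted_hours:.0f} hours (~${wasted_cost:,.0f})")
```

In the spirit of doing only as much work as the argument needs, the sketch stops at a single element instead of auditing every component.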

JH: [00:34:55] Yeah, it's a really good point. I feel like whenever I get into quantifying stuff or pulling like, you know, hard data, like it's hard not to just jump into the mindset of wanting to quantify everything or like find all the adjacent related stuff. And so

Kate: [00:35:05] and then you'll never 

JH: [00:35:06] Exactly. Yeah. It's a really important skill.

Erin: [00:35:09] So we have about 15 minutes left here and we have a number of questions. So I'm just gonna look through to some of the more popular ones. And if you haven't gotten your question in or sort of cast your vote for questions that you might be interested in, also hearing the answer to, please go ahead and do that.

So there's one here about bias, which I think is interesting to a lot of people. Anastasia here says, you know, I've noticed when I send a quantitative study, like a survey, to a current user and a user who has never used the platform, I get overly positive feedback from the user who has never used the platform. The non-users receive a paid incentive. How do you limit the bias from users who are paid an incentive?

Kate: [00:35:50] Well, I would want to know a little bit more detail about how that was run, cause it sounded like she is maybe surveying people who are not users, but she's asking them about the product. So I think it is important to make sure that you're targeting people who have the right context. So if you're going to ask people to evaluate a product, they should have used it, and have used it recently.

But this is a big problem. Like, sample bias is a huge problem with surveys. If you've ever looked at the survey data that you collect through any kind of online feedback form or survey, like any kind of intercept, or like Qualaroo, I think they call them nudges, that, you know, pop up from the corner of your screen. If you look at that data, you do tend to see polarization. You either see people who are very happy with you or people who completely hate you. And a lot of times that has to do with how you're recruiting people, like how you're timing that survey question, or, you know, who specifically is seeing it.

And then a lot of that has to do with how you structure the survey. Like when you have a really long survey that takes people a long time to complete, then it's more likely that the only people who are going to sit there and go through all that are gonna be people who love you and want to tell you how great you are, or people who hate you, and they're going to go through all that just so they can tell you about this one thing they're mad about.

JH: [00:37:10] Yeah. People who have something to say. But just assume you can get the sampling right. How do you think about surveys in general as a benchmarking tool? Is that like a good tool in the tool kit, or would you go to the other options first?

Kate: [00:37:22] I think it is good. I tend to recommend, you know, not just relying on surveys alone, but pairing it with one of the observed methods. So I would pair that with quantitative usability testing or with analytics, because with analytics and quantitative usability testing, you're really looking at what people do. With the surveys, you're listening to what they say. So those two different sources or types of data can be really complementary together. So I usually recommend trying to use a couple of those methods.

JH: [00:37:49] Got another question there?

Erin: [00:37:50] Yeah. I've got some tactical questions that might be some quick hits here, that I'm sure you have good experience with answering. Um, how do you measure time on task during a study? I tried a stopwatch. Are there some, you know, go-to platforms that you like to use for measuring that?

Kate: [00:38:07] Wait, are you reading the question? Somebody else used a stopwatch or

Erin: [00:38:10] An anonymous attendee. They used the stopwatch and I think they're looking for a better way. Yeah.

Kate: [00:38:16] Anybody who has been in UX research for a long time now probably remembers that there was a time where a stopwatch was the only way that you could get time on task, cause we didn't have the screen recordings, or you could do them, but they were low quality or they would, like, blow up your computer. So that used to be the traditional way that UX researchers would get time on task. But anybody who's tried that knows it's really easy to mess up. Like you can forget to start it and then you lose all that data.

For unmoderated studies, there's a lot of tools, like Loop11, like UserZoom, Userlytics I think recently added support for this, that have built-in ways for you to collect those metrics using an unmoderated approach. So that's part of the reason also why unmoderated is super popular for quantitative studies. For the most part, those are reliable. You still have to kind of go in and spot check to make sure nothing's happening, you don't have cheaters or outliers that you gotta take out of your data.

There used to be a tool that you could use for in-person studies called Morae. I don't know if you guys have ever encountered Morae by TechSmith. I think like last year or the year before they announced they're going to retire it. So the way that I usually do it for an in-person study is I'll just have video recordings. So I'll just go through, or it's something that you can have an intern do, or somebody new to the team can go through each of the recordings and just write down, here's where they started, here's where they stopped.
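Turning those written-down start and stop points into time on task is a small calculation. A minimal sketch, assuming the notes are kept as video timestamps. The participant IDs, timestamps, and the `to_seconds` helper are all hypothetical:

```python
from statistics import median


def to_seconds(timestamp: str) -> int:
    """Convert a 'MM:SS' or 'HH:MM:SS' video timestamp to seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# Hypothetical notes: (participant, task start, task stop) read off each recording.
notes = [
    ("P1", "02:10", "03:40"),
    ("P2", "01:05", "04:05"),
    ("P3", "00:45", "02:00"),
]

# Time on task per participant, in seconds.
times = {p: to_seconds(stop) - to_seconds(start) for p, start, stop in notes}
print(times)
print("Median time on task (s):", median(times.values()))
```

The median is used here as one reasonable summary, since a single very slow participant can pull the mean up considerably.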

JH: [00:39:39] That's a good idea. And then you have the videos here, which is a nice

Kate: [00:39:41] Yes. Right. And so then again, you get the qualitative insights from that quantitative study.

JH: [00:39:47] I'm going to go a little out of order on the votes, just cause this seems kind of related. From Douglas de Santi: what's your view on session replay tools, which can work to bring a quantitative analysis to qualitative data, like watching, you know, a hundred sessions of real shoppers who left the checkout? Is that something that kind of fits in here if you're trying to do some of this stuff, or do those have some of the same pitfalls as, like, general analytics?


Kate: [00:40:06] I think session recordings work really well as a complement to your broader analytics practice, like your broader analytics research strategy. Especially if you have a session replay tool that allows you to target specific sessions, that's pretty key, because with some of these tools, you'll just pick a random session and you don't know what's going to happen. And that's not super useful.

But I still think that, more than anything, that's kind of a substitute for actual user research. And I really encourage teams to try to do, you know, maybe even more unmoderated or qualitative research rather than those session recordings, cause you still don't have a total understanding of what people are doing. Or, you know, if they stop, you don't know if that's because their toddler just ran through the room and they're distracted, or it's because they're staring at the screen trying to read your bad content or something. So you just miss that context still.

Erin: [00:41:02] We've got a good question here from TA, I'm probably saying your name wrong. I apologize. Tire B. But they want to know, what if the benchmarking results don't match stakeholder success measures? So if you're working, you know, internally with a stakeholder, with a client, and obviously you want to show from the benchmarking that this was worth the time and the expense. What happens when it goes the other way? Any practical tips for how to navigate that?

Kate: [00:41:25] And that happens, right? Like, you know, sometimes that's the case. I mean, I was talking to a friend of mine recently who works for a very large multinational corporation, and they were getting ready to push this redesign out. And she was telling me about how everybody that she worked with, especially everybody on the UX team, was really concerned about these design changes and didn't think that they were going to be good, because it didn't come from user research. It came from someone in leadership who had a very specific opinion about how this should be, and none of that was based in an understanding of users. It's probably a situation a lot of people have encountered.

So because of this, she told me that the team was intentionally not going to collect their benchmark data. They were going to, like, intentionally not look at the analytics data after this change was rolled out, because they knew it would be bad. But that, I think, is the wrong way to look at it. Like, the reality is out there: either people are responding to this design change in the way that you hoped they would, or they aren't. And that's super common. Like, that's why we do research, because we can't always guess exactly how people are going to respond to something. And sometimes even when your design changes are founded in research, sometimes that's the case. So it's just, you know, part of the work. I think it's healthy to see it that way.

But, like, closing your eyes and pretending it's not happening by not collecting the data is not the right way to tackle it. So I told this woman in particular, I was like, you know, maybe you should try to argue with your team that we should collect this data, because then if it is as bad as you think it's going to be, then that gives you a little bit of ammunition to take that to your leadership and say, look, this is what's happening. This is how bad it is. So we should change this, and we should arguably, probably do some more qualitative research to help you learn how to make it better. So try to see it as, like, no matter what the result is, even if it's not what you wanted, maybe it's something that can help you make an argument.

So that's another thing with benchmarking and ROI. Sometimes teams that I work with will calculate what I call predictive ROI. So that's where you look at a gap. And that's kind of what that guy was doing with the design system. He was seeing, like, this is how much money we're wasting with this bad approach, so that's also how much money we could be saving with this other approach. So even when it's not good data, you can still use it as an argument.

JH: [00:43:51] Yeah. It's one of those ones that's like, it's such a cliche, but you learn more from the losses sometimes. You know what I mean? Cause if you make a change, you're making it because you think it's going to make it better. Nobody's like, making changes will make the product worse, let's do it. And then when you find out that that's not true because you have this data, it does cause some reflection of like, well, where was our assumption wrong? Or where didn't we understand the users correctly? And, you know, it might be tough in the immediate term and that, like, stakeholder wrangling, but it is actually where you're going to get new insights and, like, better understanding.


Kate: [00:44:18] Yeah. Knowledge is power, and I think, you know, if you expect for yourself that you will only ever make perfect design decisions, or you expect from your team, or your stakeholders expect from you, that you're going to get it right a hundred percent of the time, that is not realistic. And so maybe you can use that data to try to set those expectations.

JH: [00:44:39] Yeah. Here's a quick, fun one on a tactical point from an anonymous attendee. A lot of questions from anonymous today.

Yeah. For Olympic questioner, if a participant thinks they completed the task correctly, do you classify that as a success or a fail or something else? Like if the,


Kate: [00:44:56] So I always have what I call success and failure. When I refer to that, I mean objective success or failure. So this is one of, honestly, we don't have time to get into all of the differences between quantitative and qualitative usability testing. But one of the big ones is that you have to write your tasks so they have a specific end point.

So with qualitative, you know, if I'm working with a hospital client and we want to understand how people find new primary care physicians, I might say, you know, you're looking for a new doctor, find a new doctor. I can make it super open ended like that, cause I want to know if they're gonna go to Google and look for reviews, or if they're gonna reach out to their friends on Facebook and ask for a recommendation. I want to know all of that in a qualitative study. With quantitative, I'm going to have to say, you know, find this specific doctor's profile, because I need a specific end point. This is where I end my time on task recording, and this is how I know if this person is successful or not.

Now you might also have perceived success or failure, and that's where you ask the participant if they succeeded or not. And if you're doing an unmoderated study, that's usually built into the tool. It'll have someone say, like, yes, I was successful, or no, I wasn't. And that can be interesting, because a lot of times those things don't match up. A lot of times people are like, yeah, I got that right, and they did not. Or the opposite happens and they think that they failed when they actually didn't. So that tells you, you know, maybe there's something confusing in the messaging there. So those are two separate things and you do need to keep them separate, but they are interesting to compare.
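One way to keep objective and perceived success separate while still comparing them is to record both per participant and count the four possible combinations. A small sketch with made-up results (the data and the label wording are assumptions for illustration):

```python
from collections import Counter

# Hypothetical results: (objective_success, perceived_success) per participant.
results = [
    (True, True),
    (False, True),
    (True, False),
    (True, True),
    (False, False),
    (False, True),
]

labels = {
    (True, True): "succeeded and knew it",
    (False, False): "failed and knew it",
    (False, True): "failed but thought they succeeded",
    (True, False): "succeeded but thought they failed",
}

# Count each combination, and compute the two success rates separately.
counts = Counter(labels[r] for r in results)
objective_rate = sum(obj for obj, _ in results) / len(results)
perceived_rate = sum(per for _, per in results) / len(results)

print(f"Objective success: {objective_rate:.0%}, perceived success: {perceived_rate:.0%}")
for label, n in counts.most_common():
    print(f"{n} x {label}")
```

The mismatched rows are the ones worth a closer look, since they are where the interface is telling people something different from what actually happened.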


Erin: [00:46:31] Carrie, your wants to know, if you have any advice on methods for benchmarking, how fun something is. So, should that be reserved for qualitative studies, body language, facial expressions, and you know, how do you qualify fun? Are these sort of emotional kind of 

Kate: [00:46:46] Yeah. So that is kind of like a broader research question: how do we assess somebody's level of enjoyment with something? And I have heard of teams that are trying to build AI tools that can do that, that can look at someone's facial expressions and, you know, process whether or not they're happy or sad. But the most recent update that I had heard was that those tools still really struggled. Like, they were still very inaccurate, largely because our facial expressions are not always consistent. Like, I might furrow my brow because I'm thinking really hard about something, but it doesn't mean that I'm angry, for example.

In my experience, you know, maybe this will change in the future, but in my experience so far, the best way to know what someone's thinking is to ask them. So that's where that survey, that reported piece comes in. So asking if something is fun, that would be tricky. Like, I don't know offhand, cause that's not something I often will ask. I'll ask if people are satisfied, if it was easy or difficult, and things like that. But for fun, you'd have to find sort of a careful way to phrase the question, but that is something that you could capture on a rating scale.

Another thing you could do is a desirability test, which is where you give people a list of adjectives or words, and then the trick is it has to be a fairly large list and it has to be a diverse enough list. Like, you can't just be like, do you think this product is fun, amusing, or great?

Erin: [00:48:09] Yeah. Fun or really fun or super fun.

Kate: [00:48:12] fun or extra fun?

Yes. Right? You have to have, like, fun or boring. You know, you've got to have the antonym pairs, usually. So you give people a list and you say, choose the things that you feel like describe the experience you just had. And then when you do that quantitatively over hundreds of people, what you'll find is that the same words start to kind of rise to the top.

So you can look at, like, okay, 25% of people said that they would describe this as fun. And that's one of the key brand values or tone of voice values that we want to have for this product. And so 25% maybe isn't what we wanted. And so that's something that we're going to have to work on in the future, for example. So there's kind of different ways to approach it, but the best way is just to ask.
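Tallying a desirability test amounts to counting how many participants picked each word. A minimal sketch, assuming each participant's picks are stored as a set. The words and responses are invented, and a real study would use a much longer word list and far more participants:

```python
from collections import Counter

# Hypothetical selections: each participant picks words from a larger list
# that includes antonym pairs (fun/boring, fast/slow, and so on).
responses = [
    {"fun", "fast"},
    {"confusing", "slow"},
    {"fast", "trustworthy"},
    {"fun", "trustworthy", "fast"},
]

# Share of participants who chose each word.
word_counts = Counter(word for picks in responses for word in picks)
share = {word: count / len(responses) for word, count in word_counts.items()}

# Most frequently chosen words first; ties broken alphabetically.
for word, pct in sorted(share.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{word}: {pct:.0%}")
```

The share for a target word such as "fun" can then be tracked from one benchmark round to the next.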

Erin: [00:48:56] Yeah, I think that is a good place to end it. Any parting words of wisdom for us, Kate?

Kate: [00:49:02] Yeah, I would just say, you know, if this is something you haven't looked into before and you're not familiar with, just take some time to educate yourself. We've got a lot of free articles on our website, and I'm also a big fan of MeasuringU. So their website is measuring

That's led by Jeff Sauro. They have a lot of resources specifically around quantitative UX research and analysis and all that stuff. So that's a resource I'd definitely recommend.

Erin: [00:49:31] Okay. Fantastic. Yeah. The most popular question we didn't get to is, um, how do you measure benchmarks? So, you know, I think we talked about a number of ways, but maybe that's something where we can add some more resources and tools to answer that question.

Kate: [00:49:48] Yeah, I've got, I've got a long list of recommended research resources, 


Erin: [00:49:52] So we'll make sure to share some of that in the followup.

Carrie Boyd

Content Creator

Carrie Boyd is a Content Creator at User Interviews. She loves writing, traveling, and learning new things. You can typically find her hunched over her computer with a cup of coffee the size of her face.
