Designing beyond devices – Cheryl Platz on The Product Experience | Mind the Product | November 11, 2021 | Podcast, Product Design, Product Management

Designing beyond devices – Cheryl Platz on The Product Experience


Just when you’ve finally convinced everyone to work mobile-first, the world goes multimodal. If you’re not already working on how you might need to plan for voice interaction and conversation design, physical interaction, and even augmented reality, don’t worry – you’re not alone. We sat down with Cheryl Platz  (author, and now Director of UX, Player Platform for Riot Games) to learn how to adapt our products.

Featured Links: Follow Cheryl on LinkedIn and Twitter | Cheryl’s Website | Cheryl’s book ‘Design Beyond Devices: Creating Multimodal, Cross-Device Experiences’ | Capturing Customer Context – free downloadable worksheets | ‘How Calm Technology Can Help Us Be More Human’ talk by Amber Case

Episode transcript

Lily Smith: 

Hey Randy, let’s give people a peek behind the scenes of the podcast. What’s the weirdest thing that’s happened while we’ve recorded an episode?

Randy Silver: 

Let’s see, Lily. Besides the time my internet failed completely, or this week when SquadCast decided to randomly and permanently mute our guest, I think it’s actually when so-called smart speakers have randomly interrupted our guests.

Lily Smith: 

It was very annoying, very inconvenient, because I was really enjoying the conversation. But anyway, it’s not even Halloween yet at the time of recording, so it’s not time for the gremlins to come out and fiddle with our stuff. We know it’s not really gremlins, just weird interactions with our devices. But anyway, that’s exactly why we asked Cheryl Platz to join us this week, to explain how to design beyond devices.

Randy Silver: 

And that’s exactly what her book is called: Design Beyond Devices: Creating Multimodal, Cross-Device Experiences. Sounds smart. And you know, despite SquadCast’s best efforts, we had a great chat. So let’s not tempt fate any further; let’s get straight into it.

Lily Smith: 

The Product Experience is brought to you by Mind the Product.

Randy Silver: 

Every week, we talk to the best product people from around the globe about how we can improve our practice, and build products that people love.

Lily Smith: 

Catch up on past episodes and discover an extensive library of great content and videos. Browse for free, or become a Mind the Product member to unlock premium articles, unseen videos, AMAs, roundtables, discounts to our conferences around the world, and training opportunities.

Lily Smith: 

Mind the Product also offers free ProductTank meetups in more than 200 cities. There’s probably one near you.

Randy Silver: 

Cheryl, thank you so much for joining us today. So excited to see you. So for anyone who hasn’t read your book, or seen you do talks, or watched you do improv, or any of that, can you give us a quick intro to tell everyone who’s listening: what is it you’re doing these days, and how did you get into this whole world of product-related stuff?

Cheryl Platz: 

Sure. So hi, everybody. Again, I’m Cheryl Platz. And for a while, I was calling myself the 20-sided woman, because I’m a nerd. My husband liked to call me that because I have a lot of different things I like to do, as Randy sort of alluded to. So I’ve been working on digital products for my entire career, but I also have a number of avocations, and those include acting and creative pursuits, more artistic things. That multifaceted set of interests is what got me into user experience design in the first place: I wanted something that combined technology and creativity. And that’s also what drew me into the first of what I call the three chapters in my career: video games, enterprise, and consumer software and hardware products. And it’s that first chapter, the video game chapter, where I first got exposed to multimodal interface design, which is when you’re working with multiple inputs and outputs, or modes of interaction, with the device. And that has influenced my entire career. And that is pretty much what’s brought us here today to talk, because in December 2020 I published my first book, Design Beyond Devices: Creating Multimodal, Cross-Device Experiences. So I went from being the lead producer on a game for the Nintendo DS, which was kind of groundbreaking in a couple of ways. Fast forward, I was working on Windows Automotive, doing voice, natural language, and touch in the car, an interesting context. Then I worked on Cortana and Alexa. And then fast forward to today, and all the talks I’ve been giving and the content of my book. But I’m also very passionate about enterprise experiences, and working with AI, and working at scale. And so my book is about kind of marrying all of these things.
Because whether you’re working on cutting edge, like homes, smart speaker technology, or whether you’re working on a website that has to span from a phone to a to a traditional desktop website, you are adapting at the moment, you may be crossing interaction modalities, you’re definitely crossing device boundaries, and there are a lot of interesting design challenges along the way. And as Randy alluded to, I also do some improv which influences a lot of my work.

Randy Silver: 

Okay, there is a tonne there. And a lot of us, you know, are still working in organisations where hopefully we’ve convinced people to at least be thinking mobile-first. But you alluded to a whole bunch of other things. Is that still where we should be going? Should we be trying to influence our organisations to be mobile-first? Or is it multimodal-first? What’s the right mindset now?

Cheryl Platz: 

It’s a really great question, and it’s going to sound a little rote coming from a designer, but it really does need to be customer-first, or, as I bring home the point in my book, context-first. What does your specific customer need? If your customer is glued to their phone, and their phone is with them all the time, then yeah, mobile-first probably still makes sense. But a lot of our customers are stuck at home now, for example. And so their phone might be sitting on a desk, and they’re moving back and forth between rooms, and there’s a smart speaker, and there’s a tablet computer. And so maybe mobile-first doesn’t make as much sense anymore. Or maybe they’ve gotten more used to just barking out commands in their safe home environment, whereas in an open workspace it was awkward to use voice commands. So context matters. Context has changed significantly in the last year and a half; a lot of the assumptions we had about the way people worked and lived are no longer valid. And so the second chapter of my book is basically all about: how do we extend the way we talk to our customers to get at the real truth of the new context? So we can answer that question: is it really mobile-first for your customers? Is it watch-first today? Augmented reality-first? Is it voice-first? And the other point I make a lot in my book is that really putting any one modality first may be missing the point, because anytime we focus on one type of interaction, we may be leaving people behind. At least with mobile, when we say mobile-first, there’s usually touch and voice, so there are multiple options. But a lot of people say voice-first, and that’s great, but we’re potentially leaving people with acoustic disabilities behind, just as we left people with visual disabilities behind at the beginning of the computing revolution, right.
So what I tell people is we may be leaving people behind if we focus too much on one modality. If we say voice-first, we may be leaving people behind who can’t process acoustic stimuli, just like at the beginning of the computing revolution we might have been leaving people behind who couldn’t process visual stimuli. And so the more flexible we can be, the better for everybody.

Lily Smith: 

And you mentioned that in the last year and a half our assumptions around the way that people live and work have been shaken up and turned upside down. In terms of how people think about all the different devices or touchpoints or interactions that are available to us now in the home, there was definitely a period of it being quite gimmicky: not actually useful or practical, just interesting and innovative. Do you think we’re now getting out of that phase? It took mobile phones probably a good decade to really embed in the norm of people’s lives. Do you see that happening a lot faster with things like voice and other potential interactions?

Cheryl Platz: 

I think we’re at a really important inflection point right now. And what’s been really interesting to me is there’s been this cliff for voice, and it’s the productivity cliff. As you mentioned, there’s been some gimmicky stuff, and then there’s been some stuff that’s genuinely useful, but more in the leisure and home space: timers and alarms and reminders, super useful, but more in the home space. But when I was working on Cortana in 2014, we were working on emails and calendaring and scheduling, and that’s never taken hold in voice. And I think there are a lot of factors for that. A lot of it was the open workspace, and the fact that it’s really awkward to talk to a computer when there are 80 people around you. I think we’re at this really interesting point now that people are working from home: maybe it is now actually reasonable to bark out and say, hey, OK Google or whatever, what’s my next meeting? Can you put this on my calendar? But we’ve got this gap where people basically iced all that work because nobody was using it. So there’s a big opportunity, but I haven’t seen the industry catch back up to where we were thinking seven years ago.

Randy Silver: 

So if I need to start considering this, and trying to figure out what my smart devices are, and trying to explore beyond where I am today: you’ve talked about dimensions, and I know you’ve got a two-by-two grid. So it’s a totally leading question, and we love those, we’re product people. But can you give us an idea of how we should think about this? How do we approach this and map it out?

Cheryl Platz: 

Great question. And it is a daunting space. When you think about all the different products that use multiple input and output modalities now, there’s a huge difference between your smart speaker, and the Echo Show or the Google Home Hub, which has the screen and voice, and the TV experience, maybe your Comcast X1 or whatever, that you can talk to with your remote. Those are three very different interaction models: the amount of voice interaction that you have, the amount it gives you back, the amount of touch, the amount of visuals, all very different. So where do you as a product person decide to place yourself? And my experience, especially as we were trying to birth some of this space at Amazon back in the day, was that there were two dimensions of the customer’s context and scenarios that determined what interaction model you really needed. And dimension number one is: how rich is the information that needs to be communicated, on average?

For example, you might have low information density: my customer really only needs small chunks of information, like the current temperature, or the result of a current timer, or a sports score, little snippets. Versus something that’s much higher information density, like they’re navigating an entire set of movie times, or they’re listening to an audiobook, or a 10-day forecast. On average, the more information density we get, the less sense audio makes as an interaction modality, because the brain has a harder time processing a lot of information at once on the audio channel. And so then we start to think about what other interaction forms will help people work with more information; visual is usually better in that sense. So that’s one of the things: ask yourself, how heavy is the information we’re working with, in this scenario or in this product in general? You can do it either way, scenario or product. And then the other dimension is your customer’s spatial relationship with the device on which the experience is taking place. By that I mean: are they typically close to the device, within arm’s reach, so basically three feet or less? Or are they moving around, can they be three to 10 feet away from the device? Because if they are three to 10 feet away from the device, typically you can no longer assume that they are looking at the device.

There are a few exceptions, but that starts to change the things you can build into your assumptions about the product. First of all, if you’re not near the product, you can’t touch the product. So the types of interactions you have available to you are totally different: you really need to lean on voice, unless you have a really strong remote control, or you’re using the mobile phone as a secondary modality, or something like that. And secondarily, if they’re not near it, it’s very likely they may turn around, and then you can’t assume you have visual contact either. Now, they might get close to the device, but knowing the extent of their range, and their relationship with the device, is important. Once you know those two things, and you can get that through customer research, or by knowing exactly what device you’re targeting, or a bit of both: I’ve got a chart in my book where I plot one of these dimensions on the horizontal axis, one on the vertical axis, and then we get four quadrants, and I call this the spectrum of multimodality. Up in the upper right-hand quadrant, we’ve got adaptive experiences, which allow people to basically choose the way they want to interact with the device in the moment, a lot like your Echo Show or your Google Home Hub: if I’m near it, I can touch it; if I’m far away, I can speak to it. But in other quadrants, if your relationship with the device is far away, and it’s low information density, speaking to the device, an intangible experience, makes a lot of sense. If you’re close to the device, and you have really rich information, then an anchored, rich experience with a lot of information, like a Fire TV or other TV experience, or your desktop experience, makes a lot of sense.
And if the device is very close to you, like Google Glass (I guess, RIP), or augmented reality, or your watch, then a direct experience, where it’s very constrained and you’re inferring a lot of information from sensors so that there’s not a lot of manipulation required from the customer, is the kind of interaction model you’re probably going to go with.
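As a rough sketch, the two dimensions and four quadrants Cheryl describes could be encoded as a simple lookup. This is a hypothetical illustration of the spectrum of multimodality, not code from her book; the three-foot threshold and the quadrant assignments are paraphrased from the conversation above:

```python
def multimodality_quadrant(information_density: str, distance_ft: float) -> str:
    """Place a scenario on the spectrum of multimodality.

    information_density: "low" (snippets: a timer result, a sports score)
                         or "high" (movie listings, a 10-day forecast).
    distance_ft: the customer's typical distance from the device.
    """
    near = distance_ft <= 3  # within arm's reach: touch is available
    if near and information_density == "high":
        return "anchored"    # rich, screen-heavy: Fire TV, desktop
    if near and information_density == "low":
        return "direct"      # constrained, sensor-driven: watch, AR glasses
    if not near and information_density == "low":
        return "intangible"  # voice-led: smart speaker
    return "adaptive"        # customer chooses a modality: Echo Show, Home Hub
```

So a hands-busy timer check from across the kitchen (low density, far) would land in the intangible quadrant, while browsing movie times at a desk (high density, near) would land in the anchored one.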

Lily Smith: 

So, just thinking about the relationship that you have with different devices now: you mentioned being close to a device, but in the instance of a watch, or even a car, you are close to that device, but your attention is generally elsewhere, you’re not looking at it. How do you get to the point where you’re making a decision around, OK, this is the service or the product or the problem that I’m trying to solve? And, you know, ultimately they probably have a website already. But how do you then begin to explore all of these different ways in which you can interact with your customer or your user, in all of these different scenarios? Because it feels like such a huge wealth of opportunity, but how do you even start? Where do you start?

Cheryl Platz: 

It’s a great question. And it can be really daunting when you think about these things. One of my most popular Medium posts talked about how the voice recognition scenarios that took off on Alexa were actually not new scenarios; they were things that were already supported on mobile, but they were scenarios that weren’t convenient on mobile. They were things that, when you looked at them in context, were awkward: I have a $1,000 phone, and yes, I can set timers on it, but if my hand is covered in butter, I don’t want to touch the $1,000 phone. And so that’s why I harp on context so much. When you observe your customers, when you listen to them, it may well be that the things they already have services for are still not meeting their needs in the right way. We have these devices that are capable of so much but are not being used to their fullest capacity. And whether or not somebody is living with a disability, the Microsoft inclusive design toolkit has a really useful spectrum of both permanent and situational disabilities. They talk about how, if I’m holding a child, I have a situational disability where I can’t use my arms. There are a lot of those things that crop up in real life. And so that’s often a keystone moment that can help you figure out where a multimodal interaction might make a lot of sense for a scenario your customer already has something for. But I’m really glad you brought up attention, because attention is a really big part of this too. In one of the chapters in my book, I talk about something called an activity model. Because essentially, I don’t know if any listeners have had this experience where you try to open an app, and the app’s like, I’ve got to load, so you’re going to have to wait. So you go try to do something else: you open up an email, you start typing.
And then after a few seconds, the app you tried to open is like, I’m ready now, and interrupts you in the middle of a sentence, shifts your focus over, which is rude. The computer knows you’re in the middle of a sentence, you’re typing, the computer has all of this information available, and yet it still bonks you back from the email to the original app. It should know better, but we’ve never taught our devices what rude means, or what a human activity means, and in which contexts an interruption would be rude. And so I talk about this concept of activity modelling, where we say: for your customer, in the situation you’re trying to support with a product, what are the patterns of activity in which an interruption would be dangerous, or rude, or acceptable? And then you take the different scenarios you’re trying to support, you matrix them on top of the activities, and you come up with patterns. So you’re like: in this situation, when my customer’s running, I’m not going to distract them too much, because I don’t want to trip them; I’m going to use stuff that’s very subtle, or I’m only going to use audio, I’m not going to try to get them to look at the watch. But if I know they’re standing, because we’re not moving, then I will use visual stimuli, for example.

Randy Silver: 

I’ve got this image; it’s like a mix of Tron and Inside Out. I’ve often considered my devices to be annoying, but I’m not sure I ever considered them to be intentionally rude before, and that’s great. But you were talking about activity modelling, and can we just go a little bit deeper on that? Because we’ve done lots of stuff in the past about story mapping and other kinds of journey-mapping types of things, but what exactly is an activity model?

Cheryl Platz: 

So activity modelling is: you take a look at the patterns of behaviour for your customer in the space that your product touches. Now, for a product like Alexa, this is very broad, because it’s basically all human behaviour in the home. But in a workplace, it might be much more constrained, to the types of tasks a person is conducting at their desk, for example. You take a look at those, and you deconstruct that behaviour based on a couple of different guiding criteria. For me, it was things like: is this activity safely interruptible? Can I resume it later? And what is the cost of resuming it later? Like, if I’m in the flow and I’m writing something and I get interrupted, it’s going to take me extra time to get back into that flow. So if I’m interrupted while writing a Word doc, it’s resumable, but at a cost. Whereas if I’m on a phone call and I get interrupted, I’ve potentially lost that context forever, especially if it’s a conference call and other people are continuing without me. So those are fundamentally two different types of activity patterns: in one of them, my context will be lost forever when I get interrupted, and in the other, my context can get saved, but at a cost to me as a person. And then there are other activities, or situations, where it’s very easy for me to pick up later: maybe I’m just filling out a form and it’s all going to be saved, there’s no cost later, everything’s perfect. And so then we have three activity patterns. In the end, there’s no one right activity model, because it does depend on context. But I do share what I worked on for Alexa notifications as a case study, as a baseline you can use.
And so we had activity patterns. One type of activity was short-running tasks: the customer is engaged with the product, doing something like getting a weather report, or getting the answer to a question. Those activities can be interrupted, but the cost of resuming them is so small that we don’t even bother saving your place; we’re just like, hey, if you need to do this again, you can just ask. Versus a long-running task, where it was something like listening to music: those you could pause or attenuate, like make the volume go down. So the type of behaviour when we interrupted you was fundamentally different; you didn’t have to lose the context of what you were doing when we interrupted you. And then we had live activities, which were things like phone calls, or if you were watching TV or something, where you would lose the context if we interrupted you. And so, if you had an incoming call and we had the caller ID information, we would decide whether or not to announce the caller ID information based on what you were doing. If you were listening to the weather report, we might just interrupt it and say, OK, Cheryl Platz is calling, because it’s a time-sensitive thing and you can get the weather in five seconds; we don’t want you to miss this call. If you’re listening to music, we’ll turn the volume down a little bit and say, OK, Cheryl Platz is calling. And if you’re on a phone call already, we would just play the incoming-call tone, but we wouldn’t announce it, because you’re already in a situation where you would lose context if we interrupted you in that way. And then we provide you other ways to get that information if you needed it, like a banner, or a notification on your phone.
And so we use that activity model to make informed choices about how we interrupt you in the moment.
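The interruption policy Cheryl walks through above can be sketched as a small decision table. The three pattern names come from her Alexa notifications case study; the function, its signature, and the exact strings are illustrative, not the real implementation:

```python
from enum import Enum

class Activity(Enum):
    SHORT_RUNNING = "short"  # weather report, Q&A: cheap to just redo
    LONG_RUNNING = "long"    # music: pause or attenuate, context survives
    LIVE = "live"            # phone call, live TV: context lost if interrupted

def incoming_call_behaviour(current: Activity) -> str:
    """Decide how to surface a time-sensitive incoming call."""
    if current is Activity.SHORT_RUNNING:
        return "interrupt and announce caller ID"
    if current is Activity.LONG_RUNNING:
        return "attenuate volume, then announce caller ID"
    # LIVE: never talk over the activity; route the details elsewhere
    return "play incoming-call tone only; send details to a banner"
```

The point of the table is that the same event (an incoming call) gets a different treatment depending on whether the interrupted activity’s context can survive the interruption.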

Lily Smith: 

And so if I want to dive into this, if I think there are opportunities to go more multimodal (is that the word?), who do I need on my team, and what’s the best place to get started? If I have a UX person who’s mainly got experience in, like, website UX, is that sufficient? I mean, I guess it depends on the person. But how do we start to dabble in multimodal products?

Cheryl Platz: 

So if you have a designer who is familiar with the interaction design side of UX (sometimes you have visually focused designers, and this may be a bit more of a stretch for them), if someone is an interaction designer, this is a set of skills and thinking that layers on top of their existing skillset. And we all start somewhere, right? I wasn’t a multimodal designer at first; we all start somewhere. They’ll also benefit from a very strong partnership with their product managers and their programme managers, because there are a lot of technical constraints, and there’s a lot of complexity here. So you can go on this journey together; you can learn about your customers. There are some materials on my website that can help you get started with a shared understanding workshop, where you dump all your current understanding of your customer on the table, and then start working towards what gaps still exist in your understanding of your customer that you’ll need to fill in order to plot your experience on the spectrum of multimodality and make informed decisions about your activity model and what scenarios you need to support.

Lily Smith: 

And you have a really nice method of understanding your customers, which I hadn’t come across before, that comes from your improv, I believe, called CROW, yes? Tell us about how that works.

Cheryl Platz: 

So I can’t take credit for the acronym itself, but the CROW acronym is a tool we use at my improv theatre that I’ve worked with for the past 13 years, Unexpected Productions in the Pike Place Market in Seattle, which, fun fact, is the reason the gum wall exists, if you’ve ever heard of that; that’s a fun rabbit hole we won’t go down here. But when you are a professional improv performer, there is a high bar for your narrative, because people have paid to see you perform, and they expect you to tell interesting stories. So we use this acronym as a reminder of what elements the human brain basically needs to find a story compelling enough that it can fill in the blanks. And I took that, as an improviser slash user researcher, as a map of what represents a holistic scenario: a holistic person, a holistic environment. CROW stands for Character, Relationship, Objective, and Where. I’ve broken that down for folks in Medium posts, in my book, and in some materials on my website, so that you can use these four letters to deconstruct your understanding of the customer and extend your existing customer outreach, or your existing user experience research practice, to get the additional context you’ll need to make informed multimodality choices. So really getting into the Character part and understanding: it’s not just asking, hey, who are you and how old are you? It’s what is fundamental to them. What marginalised groups might they be a member of that are going to influence their ability to interact with some of these modalities?
And what perceptions are they going to bring that are going to influence, for example, what gestures they’re going to think are appropriate? For the Relationship part, I talk about the importance of asking about human-to-human relationships. We talk a lot about human-to-computer and human-to-company, but do we talk about the relationship of your customer to the other customers, the people in their home? If they’re sharing a phone, if they’re sharing the computer, that has an impact. If you’re using an Alexa device and you’re switching profiles, that adds a whole layer of complexity. If your kid’s using your phone and switching profiles, that adds a whole layer of complexity. If you’re trying to do a banking app on Alexa and they have someone in their home they don’t trust, that’s a deal-breaker. So asking questions, and broadening the way you ask questions, is really important. And then the Objective part is a really good reminder for us all to make sure that we’re grounding our user stories in real human needs, and not technical needs. You know, I don’t browse a list of virtual machines for fun; I browse it because I want to find the broken virtual machine and reboot it, for example. And then the Where is really important for multimodality, because it’s usually the environment that’s driving a lot of the changes customers make to whether they’re interacting with a phone or a smart speaker or a desktop. Where is the customer? What devices are in arm’s reach? What distractions are there? Understanding that Where more, and not taking it for granted that everyone’s home office is the same or everyone’s kitchen is the same, will help you make the case to your stakeholders that it’s worth investing in this extra technology. Because when I worked on the Echo Look at Amazon, they were like, well, why don’t we just use an app?
You know, just use a phone app for this. But when we talked to our customers, and we watched our customers: number one, their phones weren’t with them when they were making clothing decisions. And number two, when they used both the voice and the app, they were like, I’m not going to buy this without voice; using the app is awkward. They wanted it hands-free, because they’re getting dressed, and that’s a very hands-on experience. Context matters. The Where matters; the environment matters.
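One lightweight way to carry the four CROW prompts into research notes is a record per scenario. The field names and the example below are my own reading of the framework, not Cheryl’s worksheet format:

```python
from dataclasses import dataclass

@dataclass
class CrowContext:
    """Character, Relationship, Objective, Where: one record per scenario."""
    character: str        # who they are: abilities, perceptions they bring
    relationships: list   # human-to-human: shared devices, household trust
    objective: str        # the real human need, not the technical task
    where: str            # environment: devices in reach, distractions

# Hypothetical example in the spirit of the Echo Look discussion above
wardrobe = CrowContext(
    character="getting dressed, hands busy",
    relationships=["shares devices with family members"],
    objective="decide what to wear with confidence",
    where="bedroom; phone out of reach; hands-free preferred",
)
```

Filling in the Where field first often surfaces the multimodality constraints (hands busy, phone out of reach) that justify a voice or sensor-driven interaction.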

Randy Silver: 

So Cheryl, thank you so much; we’ve really enjoyed all this. We’re running out of time, so let’s try and get one more question in. I’m just curious: for people who are just getting started in working with a multimodal mindset and taking their products into a different space, what’s the mistake you see people making most often? You know, what’s the thing that we should all watch out for?

Cheryl Platz: 

Well, I think the biggest problems are solutions in search of problems. Falling in love with a particular input modality, and coming up with something really complicated without it being grounded in a customer need, is the biggest mistake I see. That’s why I lead in my book with customer context first. I know it’s really exciting to try to boil the ocean and put all the bells and whistles in, but at the end of the day, the simple solutions are often the best. A lot of times, a simple voice command that infers a lot of what it needs from sensors and previous settings is a lot more valuable than a really well-executed multi-turn voice interaction that also supports gesture. So I think that’s probably the biggest mistake I see: assuming that the end product has to be complicated, or assuming that what people want is something showy and new.

Randy Silver: 

Cheryl, thank you so much for joining us; we really enjoyed this. You’ve got a whole bunch of resources on your website, and we’re going to link to them all in the show notes in the article. So if you want to follow up on Cheryl’s book, watch her talks, and download the worksheets, please go there.

