VoiceMarketing

Doug Schumacher

VoiceMarketing is a podcast covering the latest tactics and trends in the marketing of voice apps and audio products.

28 minutes 32 seconds

Speed Listening at 900 words per minute

My guest this week is Sina Bahram, founder of Prime Access Consulting.

Show Notes
Guest
Sina Bahram, Founder and President of Prime Access Consulting

Host
Host: Doug Schumacher
Website: Arrovox.com/VoiceMarketing
Twitter: @MemeRunner

Prime Access Consulting Website
https://pac.bz

Transcript
Intro
Doug:

Welcome to VoiceMarketing. I’m Doug Schumacher, and on today’s episode we have a very special guest, Sina Bahram.

Sina does what you might call speed listening using a TTS engine. And he’s getting up to some pretty remarkable speeds, which we’re going to go through in the podcast.

It’s a fascinating discussion about speed listening with TTS engines, voice assistant UX, and the future of audio as the primary format for content consumption.

Sina’s going to give a brief overview of his background, but I want to mention a couple of things that he didn’t:

In 2012, Sina was recognized as a White House Champion of Change by President Barack Obama, for his work enabling users with disabilities to succeed in the STEM fields of Science, Technology, Engineering, and Math.

In 2015, Sina was recognized as an Emerging Leader in Digital Accessibility at the annual Knowbility Heroes of Accessibility Awards.

In 2017, Sina served as co-chair of the Museums and the Web conference. And in 2019, Sina started serving as an invited expert on the W3C ARIA working group.

So let’s get right into it. Here’s my conversation with Sina Bahram.

Interview
Doug: Sina,

Sina: Hi, how’s it going?

Doug: Hey, good to talk to you too. So let’s go ahead and give people a quick overview of you.

Why don’t you introduce yourself and give us a quick background: your history, your education, your work, all that stuff.

Sina: Sure. I run a company called Prime Access Consulting, or PAC for short. We’re based out of Research Triangle Park in North Carolina, in Cary.

A lot of what we do is focus on inclusive design, and accessibility is one of the outcomes of inclusive design. So we look at processes, whether it’s software, or inside a museum or university or startup, what have you, and at how we can make things available to the widest possible audience. That means things like tech signs, that means things like making sure programs like a screen reader can access content, but it also means physical considerations for getting around an exhibit or getting around a service offering from a company. We’re really trying to work with designers, developers, boards of trustees, what have you, on making sure that the various affordances we’re designing and putting out into the world are as accessible as possible.

I got into this work because of my background in computer science. My undergraduate and graduate degrees are in computer science. I’m ABD in a Ph.D. in computer science, in a field called human-computer interaction, and I happen to be blind. So a lot of my time in grad school was spent, at the beginning, not wanting to go into accessibility or inclusive design, because I felt it was pretty stereotypical, as the blind guy, to go into accessibility. I wanted to do a lot of other things. But I got really frustrated with a lot of the tools I had to use, and with a lot of the things that people were claiming were accessible and in fact really were not where I knew the technology could be, or where I knew the software capabilities were that we’re able to achieve.

And so I started getting into designing solutions in this space, and then broadening that, not only for persons with disabilities or people who use assistive technologies, but really for everyone, so that we can make experiences that are delightful and enjoyable by as many people as possible, because that’s really fantastic.

Doug: You mentioned a little bit about the tools and the interface and so on, and I want to come back and talk about that. But the thing that I want to start with is the video that I saw of you on YouTube.

I believe it’s from 2011. What really captured my attention was that you were consuming text at an incredible rate. What I was hearing was literally an audio blur going past, and you were apparently parsing that and comprehending what was being said. So do you want to give a little setup to what that was about, and then we’ll go from there?

Sina: Sure. First of all, 2011 seems like a century ago, so it’s just amazing to think about that.

I was using a program called a screen reader at that time. I mentioned that I happen to be blind, which means that I’m not able to see what’s on the screen, but I’m still able to program and access email and do all of these various things, access the command line. The way that I do that is with a program called a screen reader. It uses a variety of programmatic interfaces. It’s not just doing OCR, optical character recognition, on the screen. It’s actually programmatically getting the content of these various applications, whether it’s something like Outlook, PowerPoint, or Microsoft Word, or something like Firefox or Chrome on the web, et cetera. And it is able to then turn that information into speech.

Now, this is pretty straightforward. But the thing that caught your ear was the fact that I had the speech cranked up quite a bit. I believe in the video that you’re referencing I was listening at about eight hundred, nine hundred words per minute, something around there, maybe 950. You can think of that as roughly six or seven x regular human conversation. Like, we’re talking right now at about 150, 160. I tend to talk a little fast. Average speaking rate’s right around there.

And that screen reader is driving something called a text-to-speech engine, or TTS, or just think of it as a voice. There are tons of different voices out there. I tend to use a pretty robotic-sounding voice. I mean, it really sounds like it’s out of the 1980s, and in a lot of ways it is. But because it’s mathematically generated, not a recording of someone in a studio where the phonemes have been stitched together, you can crank it as fast as you can understand. I’m fortunate enough through my life to have trained myself to listen to it very quickly, as have many other screen reader users, and that’s what you’re hearing.
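The speed figures mentioned here can be checked with some quick arithmetic. The sketch below is only an illustration: the 150 words-per-minute conversational baseline comes from the interview, while the function names and the 9,000-word sample length are assumptions made for the example.

```python
# Rough arithmetic for the listening speeds mentioned in the interview.
# The 9,000-word sample length is an illustrative assumption.


def speed_multiplier(listening_wpm: float, baseline_wpm: float = 150.0) -> float:
    """Return how many times faster than the baseline a listening rate is."""
    return listening_wpm / baseline_wpm


def minutes_to_listen(word_count: int, wpm: float) -> float:
    """Return the minutes needed to hear word_count words at wpm words per minute."""
    return word_count / wpm


if __name__ == "__main__":
    for rate in (150, 600, 900, 950):
        print(
            f"{rate} wpm is {speed_multiplier(rate):.1f}x conversational speed; "
            f"9,000 words take {minutes_to_listen(9000, rate):.1f} minutes"
        )
```

Against a 150 wpm baseline, 900 wpm works out to 6x, which matches the "six or seven x" estimate given in the conversation.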

Doug: So, do you have a clip of that you can play for us? Because I think that’ll be really fascinating.

Sina: Yeah, I can. What I’ll do here is I’ll just have my computer say some stuff out loud, and we’ll take a listen to that. Great.

All right. So there’s that.

For example, here’s the current time right now.

Right. So, three twenty-two. We can slow this down.

Right. So, slow down a little bit more. Slower. Right. That’s starting to become understandable, right? Slower. Right. And then all I’m doing is like that.

Doug: And so what is that speed right there?

Sina: I would say that’s about six to seven x, so that’s about a thousand words per minute, between nine hundred and a thousand, depending.

Doug: And you can process it?

Sina: No problem. Yeah, for sure.

Sina: Now, I might slow it down if I’m reading a contract or something like that. For example, I’ll use the interface right now to route the audio back to my headphones, and I’ll use it at the speed that I’ve got it at, because that’s more of a navigational task. I’ll do something like this: sound card, Windows default. Right. And now it’s back in my headphones. That I’m able to understand fine. But if I’m reading a book for pleasure, or if I’m reading a contract with legal language in it, especially if it’s in a spreadsheet with numbers, I will slow down. But I won’t slow down anywhere near the speed that you were listening to when I had it all the way low. I might just slow down a few ticks, if you will. So call it 700 words or 600. You know, it’s really, really relaxed.

Doug: And that’s how anybody would read printed text as well. With contracts and technical things you tend to read a lot slower, and with more of the leisure and long-format stuff you tend to accelerate your reading speed. So yeah, that’s really fascinating.

Now, with the text-to-speech generator you’ve got on your computer, when you’re surfing the web, are you reading primarily from ADA-compliant websites, or are you processing pretty much everything?

Sina: Well, that’s an intricate question, right? When you say an ADA-compliant website, you’re referring to something called website accessibility, or the fact that a web page is able to be accessed in a better way with assistive technologies. And the way that you do that is that the developers and the designers have followed something called the Web Content Accessibility Guidelines, or WCAG for short.

That’s really important, because what it is is a set of principles, a set of rules, that really are the things that everybody should be doing on a website: making sure text is available as text, not hiding it in an image, that sort of thing. It doesn’t mean that websites that are not like that cannot be gotten through. It just means that if these accessibility considerations are not taken, then it’s a lot more arduous, a lot more difficult. So imagine an academic portal or a news website with a ton of links on it and no headings on the page. Then you can’t use headings to navigate around, so you have to hear all those links or figure out a way of skipping past them. It’s those kinds of things that you end up doing. You can still sort of get to th
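The point about headings can be made concrete with a small sketch. It is an illustration only, not how any particular screen reader is implemented: it uses Python's standard `html.parser` to pull the heading outline out of a page and count its links. The `HeadingOutline` class, the `outline` helper, and the sample page are all hypothetical names and data invented for this example.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1-h6 elements and count links."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []          # list of (level, text) in document order
        self.link_count = 0         # number of <a> start tags seen
        self._current_level = None  # heading level currently being read
        self._buffer = []           # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current_level = int(tag[1])
            self._buffer = []
        elif tag == "a":
            self.link_count += 1

    def handle_data(self, data):
        if self._current_level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._current_level is not None:
            text = "".join(self._buffer).strip()
            self.headings.append((self._current_level, text))
            self._current_level = None


def outline(html: str):
    """Return (headings, link_count) for an HTML string."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.headings, parser.link_count


# Hypothetical sample page: a block of links followed by headed sections.
SAMPLE_PAGE = """
<nav><a href="/a">World</a><a href="/b">Sports</a><a href="/c">Weather</a></nav>
<h1>Daily News</h1>
<h2>Top Story</h2><p>Text of the story.</p>
<h2>Local</h2><p>More text.</p>
"""

if __name__ == "__main__":
    headings, links = outline(SAMPLE_PAGE)
    print(f"{links} links, {len(headings)} headings")
    for level, text in headings:
        print("  " * (level - 1) + text)
```

With headings present, a listener can jump straight to a section. If the same page had no headings, the outline would be empty and the links would have to be heard or skipped one by one.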

27 August 2019, 12:50 pm
8 minutes 9 seconds

Writing for Voice and Audio

9 points, including navigation vs content, adding pauses for impact, and varying sentence length.

9 April 2019, 8:10 am
12 minutes 45 seconds

Content Ideation for Voice

A 7-stage process for voice content ideation:

1. Goals and objectives
2. Method of evaluation
3. Resource inventory
4. Personas
5. Needs segmentation grid
6. Concepting
7. Editorial calendar

Credits
Writer/Producer: Doug Schumacher
www.dougschumacher.com
www.Twitter.com/MemeRunner
Production Company: Arrovox
www.Arrovox.com
www.Twitter.com/Arrovox_

19 March 2019, 1:00 pm
41 minutes 33 seconds

Working with Voice Actors

An interview with David Ciccarelli, CEO of Voices.com, where we discuss ways to get the right voice for your brand, tips for getting better results from your voice sessions, and how technology is changing the voiceover industry.

14 February 2019, 9:17 pm
22 minutes 28 seconds

How listenable is synthetic speech?

When is the right time and the wrong time to use synthetic speech in a voice experience? In this episode we listen to synthetic voices reading 6 types of voice script content, and see how various content types compare.

19 September 2018, 1:00 pm
19 minutes 26 seconds

Voice App Naming

The name you choose for any product or company is the most important branding decision you'll make. In this podcast episode, we look at a number of critical considerations, including legal, platform and branding issues.

30 August 2018, 9:11 pm
12 minutes 54 seconds

Synthetic Voice Personality Parameters

Listen as we delve into the voice personality parameters you can use to shape synthetic voices. A goal of this process is to achieve a voice with personality traits that will be most persuasive with your users or target audience.

We'll use an example of Darth Vader as a target voice personality, and play with different synthetic voice parameters to get as close as possible to the essence of that voice quality, using SSML with Amazon Polly.
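As a rough illustration of shaping a synthetic voice with SSML, the sketch below builds a `<prosody>` wrapper of the kind Amazon Polly accepts. The specific pitch, rate, and volume values are assumptions chosen for the example, not the settings used in the episode, and the `build_ssml` helper is a hypothetical name. Support for individual prosody attributes varies by voice, so treat the output as a starting point.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_ssml(
    text: str,
    pitch: str = "-20%",   # assumed value: lower pitch for a deeper voice
    rate: str = "slow",    # assumed value: slower, more deliberate delivery
    volume: str = "loud",  # assumed value: more forceful delivery
) -> str:
    """Wrap text in SSML <speak> and <prosody> tags, escaping XML characters."""
    return (
        "<speak>"
        f'<prosody pitch="{pitch}" rate="{rate}" volume="{volume}">'
        f"{escape(text)}"
        "</prosody>"
        "</speak>"
    )


if __name__ == "__main__":
    ssml = build_ssml("Welcome to the show.")
    ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
    print(ssml)
```

The resulting string could then be sent to a text-to-speech service as SSML input. That step needs an account and credentials, so it is left out here.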

Episode Blog Post
http://arrovox.com/2018/07/31/ep-02-synthetic-…ality-parameters/

Resources
"Wired for Speech", by Clifford Nass
https://www.amazon.com/Wired-Speech-Activates-Human-Computer-Relationship-ebook/dp/B001949SMM/

Credits
Host: Doug Schumacher
Twitter: @MemeRunner
Production: Arrovox.com
Contact: [email protected]

31 July 2018, 1:15 pm
14 minutes 17 seconds

The Wiggins Voice Personality Model

In this episode we look at how Stanford professor of Communications and voice personality researcher Clifford Nass employed the personality model developed by psychologist Jerry Wiggins to guide voice designers in developing more persuasive voices.

We'll also map three financial services brands to this model: eTrade, Charles Schwab and Vanguard.

Episode blog post
http://arrovox.com/2018/07/31/ep-01-wiggins-voice-personality-model/

Resources
"Wired for Speech", by Clifford Nass
https://www.amazon.com/Wired-Speech-Activates-Human-Computer-Relationship-ebook/dp/B001949SMM/

Credits
Host: Doug Schumacher
Twitter: @MemeRunner
Production: Arrovox.com
Contact: [email protected]

31 July 2018, 1:00 pm