Microsoft Speech Server and Speech Technologies … What’s all the talk about?

According to Microsoft and its plans for the technologies:

  • Customers Reap Business Value Deploying Microsoft-Based Speech Applications
  • Partners Create Winning Microsoft Speech Server Solutions
  • Developers Can Speech-Enable Web Applications Quickly and Easily  

So what does this all mean? Well, the days of ‘talking computers’ are almost here. Scary? No, just reality. Speech-enabled technologies have actually been around a long time, but they have never had the support or the life that Microsoft is attempting to breathe into them… and it makes sense. Why type for 9 hours when you can ‘speak’?

Chairman and Chief Software Architect Bill Gates launched Microsoft Speech Server 2004 and spoke about the developer opportunities and business value Speech Server will enable, which analysts say will change the industry dynamic.

“For years now, this technology has been accessible only to a short list of Fortune 500 companies because it has been so difficult and expensive to implement,” Lee said. “Both large and midsize companies need a lower cost of entry and lower total cost of ownership. A key value of Speech Server is to dramatically reduce the cost and complexity of developing and deploying speech applications, making the technology more accessible to a broader range of enterprise customers. This is a Microsoft value proposition that we’re delivering today with Microsoft Speech Server 2004 and the broad ecosystem of partners that provide complete solutions, integration services and a variety of other services for customers.”

Speech Impediment?

The challenges of ‘Voice and Speech Recognition’ have caused plenty of unclear articulation over the years… Microsoft is addressing these challenges as they arise. The most common problems that have haunted earlier speech systems are:

  • It’s not standardized; most implementations are proprietary
  • Technology has not been easy to adapt or deploy
  • Humans do not all speak the same way; many use slang or other lingo
  • Language barriers
  • Coding and software issues
  • Background noise can alter the ‘sound’ the computer accepts
  • Standard I/O peripherals (mouse, keyboard) are simpler to deploy
  • Currently still in a ‘niche’ market  

Advanced Speech Recognition Benefits

Microsoft is breaking new ground in the speech industry by becoming the first company to offer a single platform that combines:

  • Web technologies
  • Speech-processing services
  • Telephony capabilities

Once the platform is opened up to developers, another market will open as vendors start to make ‘speech aware’ applications. Can’t you just see it now? What do you want for the holidays next year: a new wireless keyboard or a microphone? If this takes off, you can assume that web-facing and other technologies would also have to standardize on speech… it may create a new way to operate and do work (ATMs, kiosks) and could revolutionize the way we work, not only in technology but in other areas. What about people with disabilities? This will open new doors for them as well; building speech technology into affordable solutions will definitely cause some kind of landscape change in the next 10 years if successful. Think about going to work wearing a headset that roams to your cell phone and then back to your SOHO setup at home. I am not saying that this is a solution today, but as with everything else (wireless, etc.), it wouldn’t surprise me in the least. Imagine telephones, mobile phones, Pocket PCs, Smartphones, laptops, BlackBerrys and so on all being ‘speech aware’.

Security appliance vendors and security software coders would have to consider this technology as well.

Machines Communicating Like Humans: Reality?

Yes. Vendors are working to create ‘communication tools’ that will, for example, allow an email to change into a voice conversation or vice versa. Home security systems connected to your PC could be turned on and off with your voice, maybe even ‘only your voice’ for security reasons. As you can see… things can (and will) change as long as technology develops. This all depends (of course) on how stable the technology is, how much it costs and so on. Also released this year was information about Microsoft’s move to upgrade the Xbox. Given its advancements in the ‘gaming’ space, who wants to bet that Microsoft will look to take over the phone and everything else in the home as well? This puts just about everything in your house in a state of ‘possible convergence’. Speech technologies may be very cool when your Xbox is the only one on the block that you can ‘talk to’. I don’t know about you, but I can’t wait until my video games start to talk to me; then maybe the Xbox will become our next virtual reality play-toy as Microsoft also moves into the TV space and so on.

I am speculating a bit on most of this, but between my over-vivid imagination and some checking of sources, it’s safe to say that all of these things are likely in development or in the engineering phase.

Who else is Talking?

Others are in the speech market as well. IBM is another large player with speech technologies already underway, and it is looking to become a major player in the arena with offerings planned for the rest of this decade. One such application, ‘Computer Kiosks’ that help translate from one language to another, is under development as we speak (no pun intended).


In this article we looked at the Microsoft ‘Speech’ services and technologies set to hit this year and (hopefully) continue to grow quickly, as this is a worthwhile technology that really works to make life ‘easier’, which in essence provides more quality time. Speaking instead of typing will shorten your day. Speech technologies seem to be a win-win solution… do you disagree? Let us know what you think in the Forum.

In future articles (as we beta test Speech Server 2004), you will learn the technical intricacies. Stay tuned for our next article, which gets under the hood of speech technologies to see how it all works.

Our next article on speech will cover VoiceXML, XHTML and SALT, which stands for Speech Application Language Tags. The following article will cover the installation and configuration of Speech Server and how to use it.

Stay tuned.

