Speech Recognition

KaldiASR

oss

Mozilla's DeepSpeech

oss

  it tatatatatat young i started in two thousand iterating to mash it in
  london i say i think to moses fantasca change is the names of the complices
  every year men and so when when you're referring to the previous
  confinement quite remember what i would call and if he read my glove you
  should really now you consented essay the one check some of the title of
  her and then of the mutable and underways givin it in the future then way
  this is as comes he nerto thousand thirteen ten and francesca begod cook
  actually figured out now that francesca leap to the mean cook that was
  because he was a

  • Computation time

    ~1m

    • Seems to scale linearly with input. ~30s of input takes ~30s of processing
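The linear-scaling observation can be expressed as a real-time factor (processing time divided by audio duration). This is a minimal sketch; the function name is hypothetical, and the figures are the rough ones noted above, not fresh measurements.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    A value near 1.0 means the engine needs about as long as the audio lasts.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Rough figures from the note above: ~30s of input takes ~30s of processing.
rtf = real_time_factor(processing_seconds=30, audio_seconds=30)
print(f"real-time factor: {rtf:.1f}")
```

If the scaling really is linear, the factor should stay roughly constant for longer inputs.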

OpenAI's Whisper

oss

So far, it is a great open-source (MIT-licensed) speech recognition engine that is easy to install and use, and it yields great results in reasonable time.

real	34m57,850s
user	62m10,509s
sys	0m57,646s

[00:00.000 --> 00:26.000]  The started in 2013. I think code mesh in London. I say I think code mesh because Francesco
[00:26.000 --> 00:33.520]  changes the names of the conferences every year. When you're referring to a previous
[00:33.520 --> 00:39.280]  conference, you can never remember what it was called. If you read my blog, you should
[00:39.280 --> 00:44.360]  really name the conferences by the SHH1 check some of the title. Then it would be a
[00:44.360 --> 00:49.520]  mutable and you'd always be able to find it in the future. Anyway, this was code mesh
[00:49.520 --> 00:54.120]  in 2013. Francesco is a very good cook.
[00:54.120 --> 01:00.440]  Actually, the thing you don't know about Francesco is he's a mean cook because he was a short
[01:00.440 --> 01:07.400]  order cook in the university. I mean, he was certainly a bit of extra money in his spare time.
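
A transcript in the timestamped format above could be produced from Python roughly as follows. This is a sketch, not the exact command used for these notes: it assumes the `openai-whisper` package is installed, the `"base"` model name and the `talk.mp3` path are placeholders, and `format_timestamp` / `format_segment` are hypothetical helpers written here to mimic the layout of the output.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS.mmm, matching the transcript lines above."""
    total_ms = round(seconds * 1000)
    minutes, rest_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(rest_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segment(segment: dict) -> str:
    """Render one segment dict (keys: start, end, text) as a transcript line."""
    start = format_timestamp(segment["start"])
    end = format_timestamp(segment["end"])
    # Segment text usually begins with a space, hence the double space in the output.
    return f"[{start} --> {end}] {segment['text']}"


def transcribe(path: str, model_name: str = "base") -> list[str]:
    """Transcribe an audio file with Whisper and return formatted lines."""
    import whisper  # imported lazily; requires the openai-whisper package

    model = whisper.load_model(model_name)  # model name is an assumption
    result = model.transcribe(path)
    return [format_segment(segment) for segment in result["segments"]]


if __name__ == "__main__":
    # "talk.mp3" is a placeholder path, not the recording used in these notes.
    for line in transcribe("talk.mp3"):
        print(line)
```

Larger models are generally slower but more accurate, so the model choice affects both the timings and the transcript quality.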