Охота на электроовец. Большая книга искусственного интеллекта — Сергей Сергеевич Марков
https://amhistory.si.edu/archives/speechsynthesis/dk_737a.htm

2369

Yoshimura T., Tokuda K., Masukoy T., Kobayashiy T., Kitamura T. (1999). Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis // http://www.sp.nitech.ac.jp/~zen/yossie/mypapers/euro_hungary99.pdf

2370

Imai S., Sumita K., Furuichi C. (1983). Mel Log Spectrum Approximation (MLSA) Filter for Speech Synthesis / Electronics and Communications in Japan, Vol. 66-A, No. 2, 1983 // https://doi.org/10.1002/ecja.4400660203

2371

Otradnykh F. P. (1953). An episode from the life of academician A. A. Markov [Эпизод из жизни академика А. А. Маркова] / Istoriko-matematicheskie issledovaniya, No. 6, pp. 495—508 // http://pyrkov-professor.ru/default.aspx?tabid=195&ArticleId=44

2372

Chen S.-H., Hwang S.-H., Wang Y.-R. (1998). An RNN-based prosodic information synthesizer for Mandarin text-to-speech / IEEE Transactions on Speech and Audio Processing, Vol. 6, No. 3, pp. 226—239 // https://doi.org/10.1109/89.668817

2373

Zen H., Senior A., Schuster M. (2013). Statistical parametric speech synthesis using deep neural networks / Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013 // https://doi.org/10.1109/ICASSP.2013.6639215

2374

Kang S., Qian X., Meng H. (2013). Multi-distribution deep belief network for speech synthesis / Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013 // https://doi.org/10.1109/ICASSP.2013.6639225

2375

Ling Z.-H., Deng L., Yu D. (2013). Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis / IEEE Transactions on Audio, Speech, and Language Processing, Vol. 21(10), pp. 2129—2139 // https://doi.org/10.1109/tasl.2013.2269291

2376

Lu H., King S., Watts O. (2013). Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis / Proceedings of the 8th ISCA Speech Synthesis Workshop (SSW), 2013 // http://ssw8.talp.cat/papers/ssw8_PS3-3_Lu.pdf

2377

Qian Y., Fan Y., Hu W., Soong F. K. (2014). On the training aspects of deep neural network (DNN) for parametric TTS synthesis / Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014 // https://doi.org/10.1109/ICASSP.2014.6854318

2378

Fan Y., Qian Y., Xie F., Soong F. K. (2014). TTS synthesis with bidirectional LSTM based recurrent neural networks / Interspeech 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14—18, 2014 // https://www.isca-speech.org/archive/archive_papers/interspeech_2014/i14_1964.pdf

2379

Fernandez R., Rendel A., Ramabhadran B., Hoory R. (2015). Using Deep Bidirectional Recurrent Neural Networks for Prosodic-Target Prediction in a Unit-Selection Text-to-Speech System / Interspeech 2015, 16th Annual Conference of the International Speech Communication Association, 2015 // https://www.isca-speech.org/archive/interspeech_2015/i15_1606.html

2380

Wu Z., Valentini-Botinhao C., Watts O., King S. (2015). Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis / Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015 // https://doi.org/10.1109/ICASSP.2015.7178814

2381

Zen H. (2015). Acoustic Modeling in Statistical Parametric Speech Synthesis — From HMM to LSTM-RNN / Proceedings of the First International Workshop on Machine Learning in Spoken Language Processing (MLSLP2015), Aizu, Japan, 19–20 September 2015 // https://research.google/pubs/pub43893/

2382

Merritt T., Clark R. A. J., Wu Z., Yamagishi J., King S. (2016). Deep neural network-guided unit selection synthesis / 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) // https://doi.org/10.1109/ICASSP.2016.7472658

2383

Holschneider M., Kronland-Martinet R., Morlet J., Tchamitchian P. (1989). A real-time algorithm for signal analysis with the help of the wavelet transform / Combes J.-M., Grossmann A., Tchamitchian P. (1989). Wavelets: Time-Frequency Methods and Phase Space. Springer Berlin Heidelberg // https://books.google.ru/books?id=3R74CAAAQBAJ

2384

Dutilleux P. (1989). An implementation of the “algorithme à trous” to compute the wavelet transform / Combes J.-M., Grossmann A., Tchamitchian P. (1989). Wavelets: Time-Frequency Methods and Phase Space. Springer Berlin Heidelberg // https://books.google.ru/books?id=3R74CAAAQBAJ

2385

Yu F., Koltun V. (2016). Multi-scale context aggregation by dilated convolutions // http://arxiv.org/abs/1511.07122

2386

Chen L.-C., Papandreou G., Kokkinos I., Murphy K., Yuille A. L. (2015). Semantic image segmentation with deep convolutional nets and fully connected CRFs // http://arxiv.org/abs/1412.7062

2387

van den Oord A., Dieleman S., Zen H., Simonyan K., Vinyals O., Graves A., Kalchbrenner N., Senior A., Kavukcuoglu K. (2016). WaveNet: A generative model for raw audio // https://arxiv.org/pdf/1609.03499.pdf

2388

van den Oord A., Dieleman S. (2016). WaveNet: A generative model for raw audio // https://deepmind.com/blog/article/wavenet-generative-model-raw-audio

2389

van den Oord A., Li Y., Babuschkin I., Simonyan K., Vinyals O., Kavukcuoglu K., van den Driessche G., Lockhart E., Cobo L. C., Stimberg F., Casagrande N., Grewe D., Noury S., Dieleman S., Elsen E., Kalchbrenner N., Zen H., Graves A., King H., Walters T., Belov D., Hassabis D. (2017). Parallel WaveNet: Fast High-Fidelity Speech Synthesis // https://arxiv.org/abs/1711.10433

2390

Jin Z., Finkelstein A., Mysore G. J., Lu J. (2018). FFTNet: A Real-Time Speaker-Dependent Neural Vocoder / 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) // https://doi.org/10.1109/ICASSP.2018.8462431

2391

Kalchbrenner N., Elsen E., Simonyan K., Noury S., Casagrande N., Lockhart E., Stimberg F., van den Oord A., Dieleman S., Kavukcuoglu K. (2018). Efficient Neural Audio Synthesis // https://arxiv.org/abs/1802.08435

2392

Prenger R., Valle R., Catanzaro B. (2018). WaveGlow: A Flow-based Generative Network for Speech Synthesis // https://arxiv.org/abs/1811.00002

2393

Valin J.-M., Skoglund J. (2018). LPCNet: Improving Neural Speech Synthesis Through Linear Prediction // https://arxiv.org/abs/1810.11846

2394

Govalkar P., Fischer J., Zalkow F., Dittmar C. (2019). A Comparison of Recent Neural Vocoders for Speech Signal Reconstruction / 10th ISCA Speech Synthesis Workshop, 20—22 September 2019, Vienna, Austria // https://doi.org/10.21437/SSW.2019-2

2395

Wang Y., Skerry-Ryan RJ, Stanton D., Wu Y., Weiss
