2396. Lee J., Cho K., Hofmann T. (2017). Fully Character-Level Neural Machine Translation without Explicit Segmentation // https://arxiv.org/abs/1610.03017
2397. Srivastava R. K., Greff K., Schmidhuber J. (2015). Training Very Deep Networks // https://arxiv.org/abs/1507.06228
2398. Griffin D. W., Lim J. S. (1984). Signal estimation from modified short-time Fourier transform / IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 32, Iss. 2, pp. 236—243 // https://doi.org/10.1109/TASSP.1984.1164317
2399. Sotelo J., Mehri S., Kumar K., Santos J. F., Kastner K., Courville A., Bengio Y. (2017). Char2Wav: End-to-End Speech Synthesis / International Conference on Learning Representations (ICLR-2017) // https://mila.quebec/wp-content/uploads/2017/02/end-end-speech.pdf
2400. Mehri S., Kumar K., Gulrajani I., Kumar R., Jain S., Sotelo J., Courville A., Bengio Y. (2016). SampleRNN: An Unconditional End-to-End Neural Audio Generation Model // https://arxiv.org/abs/1612.07837
2401. Arik S. Ö., Chrzanowski M., Coates A., Diamos G., Gibiansky A., Kang Y., Li X., Miller J., Ng A., Raiman J., Sengupta S., Shoeybi M. (2017). Deep Voice: Real-time Neural Text-to-Speech // https://arxiv.org/abs/1702.07825
2402. Shen J., Pang R., Weiss R. J., Schuster M., Jaitly N., Yang Z., Chen Z., Zhang Y., Wang Y., Skerry-Ryan RJ, Saurous R. A., Agiomyrgiannakis Y., Wu Y. (2018). Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions // https://arxiv.org/abs/1712.05884
2403. Arik S. Ö., Diamos G., Gibiansky A., Miller J., Peng K., Ping W., Raiman J., Zhou Y. (2017). Deep Voice 2: Multi-Speaker Neural Text-to-Speech // https://arxiv.org/abs/1705.08947
2404. Taigman Y., Wolf L., Polyak A., Nachmani E. (2017). VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop // https://arxiv.org/abs/1707.06588
2405. Ren Y., Ruan Y., Tan X., Qin T., Zhao S., Zhao Z., Liu T.-Y. (2019). FastSpeech: Fast, Robust and Controllable Text to Speech / Advances in Neural Information Processing Systems 32 (NeurIPS 2019) // https://papers.nips.cc/paper/8580-fastspeech-fast-robust-and-controllable-text-to-speech
2406. Charpentier F., Stella M. (1986). Diphone synthesis using an overlap-add technique for speech waveforms concatenation / ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 11, pp. 2015—2018 // https://doi.org/10.1109/ICASSP.1986.1168657
2407. Lu P., Wu J., Luan J., Tan X., Zhou L. (2020). XiaoiceSing: A High-Quality and Integrated Singing Voice Synthesis System // https://arxiv.org/abs/2006.06261
2408. Valle R., Li J., Prenger R., Catanzaro B. (2019). Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens // https://arxiv.org/abs/1910.11997
2409. Lee Y., Rabiee A., Lee S.-Y. (2017). Emotional End-to-End Neural Speech Synthesizer // https://arxiv.org/abs/1711.05447
2410. Stanton D., Wang Y., Skerry-Ryan RJ. (2018). Predicting Expressive Speaking Style From Text in End-To-End Speech Synthesis // https://arxiv.org/abs/1808.01410
2411. Hsu W.-N., Zhang Y., Weiss R. J., Zen H., Wu Y., Wang Y., Cao Y., Jia Y., Chen Z., Shen J., Nguyen P., Pang R. (2018). Hierarchical Generative Modeling for Controllable Speech Synthesis / International Conference on Learning Representations (ICLR-2019) // https://arxiv.org/abs/1810.07217
2412. Biadsy F., Weiss R. J., Moreno P. J., Kanevsky D., Jia Y. (2019). Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation // https://arxiv.org/abs/1904.04169
2413. Jia Y., Weiss R. J., Biadsy F., Macherey W., Johnson M., Chen Z., Wu Y. (2019). Direct speech-to-speech translation with a sequence-to-sequence model // https://arxiv.org/abs/1904.06037
2414. Jia Y., Zhang Y., Weiss R. J., Wang Q., Shen J., Ren F., Chen Z., Nguyen P., Pang R., Moreno I. L., Wu Y. (2019). Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis // https://arxiv.org/abs/1806.04558
2415. Wang C., Chen S., Wu Y., Zhang Z., Zhou L., Liu S., Chen Z., Liu Y., Wang H., Li J., He L., Zhao S., Wei F. (2023). Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers // https://arxiv.org/abs/2301.02111
2416. * Translated by Véra Nabokov.
2417. Tiku N. (2022). The Google engineer who thinks the company’s AI has come to life / The Washington Post, June 11, 2022 // https://www.washingtonpost.com/technology/2022/06/11/google-ai-lamda-blake-lemoine/
2418. Sanyal S. (2022). Sentient AI has Hired a Lawyer to Fight its Legal Battles! Beware / Analytics Insight, June 22, 2022 // https://www.analyticsinsight.net/sentient-ai-has-hired-a-lawyer-to-fight-its-legal-battles-beware/
2419. Levy S. (2022). Blake Lemoine Says Google's LaMDA AI Faces 'Bigotry' / Wired, June 17, 2022 // https://www.wired.com/story/blake-lemoine-google-lamda-ai-bigotry/
2420. Tiku N. (2022). Google fired engineer who said its AI was sentient / The Washington Post, July 22, 2022 // https://www.washingtonpost.com/technology/2022/07/22/google-ai-lamda-blake-lemoine-fired/
2421. Lemoine B. (2022). Is LaMDA Sentient? — an Interview // https://cajundiscordian.medium.com/is-lamda-sentient-an-interview-ea64d916d917
2422. FinanciallyYours (2023). 4. Interview with Blake Lemoine, Former Google Employee, on AI, ChatGPT and GPT-4 / YouTube, March 10, 2023 // https://www.youtube.com/watch?v=7054ye4R8p0
2423. Radius MIT (2023). Blake Lemoine: AI with a Soul / YouTube, March 17, 2023 // https://www.youtube.com/watch?v=d9ipv6HhuWM
2424. ScienceVideoLab (2022). Динозавры — фэйк. Свободу нейросетям! Кошки захватят мир | Фрик-Ринг. Учёные против мифов 18-9 [Dinosaurs Are a Fake. Freedom for Neural Networks! Cats Will Take Over the World | Freak Ring. Scientists Against Myths 18-9] / YouTube, August 25, 2022 // https://www.youtube.com/watch?v=omV-CwScKsE
2425. Sutskever I. (2022) / Twitter // https://twitter.com/ilyasut/status/1491554478243258368
2426. Romero A. (2022). OpenAI’s Chief Scientist Claimed AI May Be Conscious — and Kicked Off a Furious Debate / Towards Data Science, March 16, 2022 //