Шрифт:
Интервал:
Закладка:
3299
Kutalev A., Lapina A. (2021). Stabilizing Elastic Weight Consolidation method in practical ML tasks and using weight importances for neural network pruning // https://arxiv.org/abs/2109.10021
3300
Kutalev A. (2020). Natural Way to Overcome the Catastrophic Forgetting in Neural Networks // https://arxiv.org/abs/2005.07107
3301
Metz L., Maheswaranathan N., Freeman C. D., Poole B., Sohl-Dickstein J. (2020). Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves // https://arxiv.org/abs/2009.11243
3302
Baydin A. G., Pearlmutter B. A., Syme D., Wood F., Torr P. (2022). Gradients without Backpropagation // https://arxiv.org/abs/2202.08587
3303
Schlag I., Sukhbaatar S., Celikyilmaz A., Yih W.-t., Weston J., Schmidhuber J., Li X. (2023). Large Language Model Programs // https://arxiv.org/abs/2305.05364
3304
Sapunov G. (2023). Large Language Model Programs. A useful conceptualization for a wide set of practices for working with LLMs // https://gonzoml.substack.com/p/large-language-model-programs
3305
Schreiner M. (2022). Meta’s AI chief: Three major challenges of artificial intelligence / MIXED, Jan 29 2022 // https://mixed-news.com/en/metas-ai-chief-three-major-challenges-of-artificial-intelligence/
3306
LeCun Y. (2022). A Path Towards Autonomous Machine Intelligence // https://openreview.net/forum?id=BZ5a1r-kVsf
3307
Assran M., Duval Q., Misra I., Bojanowski P., Vincent P., Rabbat M., LeCun Y., Ballas N. (2023). Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture // https://arxiv.org/abs/2301.08243
3308
Dickson B. (2020). The GPT-3 economy / TechTalks, September 21, 2020 // https://bdtechtalks.com/2020/09/21/gpt-3-economy-business-model/
3309
Asimov A. (2016). Foundation and Earth. HarperCollins Publishers // https://books.google.ru/books?id=0DW0rQEACAAJ
3310
* Пер. А. Ливерганта.
3311
Athalye A., Engstrom L., Ilyas A., Kwok K. (2017). Fooling Neural Networks in the Physical World with 3D Adversarial Objects // https://www.labsix.org/physical-objects-that-fool-neural-nets/
3312
Athalye А., Carlini N., Wagner D. (2018). Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples // https://arxiv.org/abs/1802.00420
3313
Athalye A., Carlini N., Haddad D., Patel S. (2018). Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples // https://github.com/anishathalye/obfuscated-gradients
3314
Athalye A., Engstrom L., Ilyas A., Kwok K. (2017). Synthesizing Robust Adversarial Examples // https://arxiv.org/abs/1707.07397
3315
Bourdakos N. (2017). Capsule Networks Are Shaking up AI — Here’s How to Use Them / Hackernoon, November 9th 2017 // https://hackernoon.com/capsule-networks-are-shaking-up-ai-heres-how-to-use-them-c233a0971952
3316
Sabour S., Frosst N., Hinton G. E. (2017). Dynamic Routing Between Capsules // https://arxiv.org/abs/1710.09829
3317
Tolstikhin I., Houlsby N., Kolesnikov A., Beyer L., Zhai X., Unterthiner T., Yung J., Steiner A., Keysers D., Uszkoreit J., Lucic M., Dosovitskiy A. (2021). MLP-Mixer: An all-MLP Architecture for Vision // https://arxiv.org/abs/2105.01601
3318
Liu H., Dai Z., So D. R., Le Q. V. (2021). Pay Attention to MLPs // https://arxiv.org/abs/2105.08050
3319
Li D., Hu J., Wang C., Li X., She Q., Zhu L., Zhang T., Chen Q. (2021). Involution: Inverting the Inherence of Convolution for Visual Recognition // https://arxiv.org/abs/2103.06255
3320
Hidalgo C. (2015). Why Information Grows: The Evolution of Order, from Atoms to Economies. Hachette UK // https://books.google.ru/books?id=0984DgAAQBAJ
3321
Swaminathan S., Garg D., Kannan R., Andres F. (2020). Sparse low rank factorization for deep neural network compression / Neurocomputing, Vol. 398, pp. 185—196 // https://doi.org/10.1016/j.neucom.2020.02.035
3322
Wu M., Parbhoo S., Hughes M. C., Roth V., Doshi-Velez F. (2019). Optimizing for Interpretability in Deep Neural Networks with Tree Regularization // https://arxiv.org/abs/1908.05254
3323
Akhtar N., Jalwana M., Bennamoun M., Mian A. S. (2021). Attack to Fool and Explain Deep Networks / IEEE Transactions on Pattern Analysis and Machine Intelligence, 26 May 2021 // https://doi.org/10.1109/TPAMI.2021.3083769
3324
Lang O., Gandelsman Y., Yarom M., Wald Y., Elidan G., Hassidim A., Freeman W. T., Isola P., Globerson A., Irani M., Mosseri I. (2021). Explaining in Style: Training a GAN to explain a classifier in StyleSpace // https://arxiv.org/abs/2104.13369
3325
Rogers A., Kovaleva O., Rumshisky A. (2020). A Primer in BERTology: What we know about how BERT works // https://arxiv.org/abs/2002.12327
3326
Geva M., Schuster R., Berant J., Levy O. (2020). Transformer Feed-Forward Layers Are Key-Value Memories // https://arxiv.org/abs/2012.14913
3327
Meng K., Bau D., Andonian A., Belinkov Y. (2022). Locating and Editing Factual Associations in GPT // https://arxiv.org/abs/2202.05262
3328
Eldan R., Russinovich M. (2023). Who's Harry Potter? Approximate Unlearning in LLMs // https://arxiv.org/abs/2310.02238
3329
Li K., Patel O., Viégas F., Pfister H., Wattenberg M. (2023). Inference-Time Intervention: Eliciting Truthful Answers from a Language Model // https://arxiv.org/abs/2306.03341
3330
Zou A., Phan L., Chen S., Campbell J., Guo P., Ren R., Pan A., Yin X., Mazeika M., Dombrowski A.-K., Goel S., Li N., Byun M. J., Wang Z., Mallen A., Basart S., Koyejo S., Song D., Fredrikson M., Kolter J. Z., Hendrycks D. (2023). Representation Engineering: A Top-Down Approach to AI Transparency // https://arxiv.org/abs/2310.01405
3331
Gurnee W., Tegmark M. (2023). Language Models Represent Space and Time // https://arxiv.org/abs/2310.02207
3332
*