Complex-Valued Neural Networks
It's a question currently on the edge of neural network research: what would happen if neural network states were complex numbers? Since weights, biases, and other network components would have both a real and an imaginary component, a quantity that was once one-dimensional would gain the freedom to move in two dimensions.
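To make the two-degrees-of-freedom point concrete, here is a minimal NumPy sketch (my own illustration, not from the original article) of a single complex-valued neuron's forward pass:

```python
import numpy as np

# A minimal sketch of a single complex-valued neuron: each complex
# weight carries two real degrees of freedom (its real and imaginary
# parts), so the pre-activation "state" lives in two dimensions.
rng = np.random.default_rng(0)

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)  # complex inputs
w = rng.standard_normal(4) + 1j * rng.standard_normal(4)  # complex weights
b = 0.1 + 0.2j                                            # complex bias

z = np.dot(w, x) + b   # pre-activation: a single complex number
print(z.real, z.imag)  # one value, two dimensions of freedom
```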
It's a fascinating idea to ponder. In theory, if a network could communicate two values in one, it might find some higher-dimensional, faster route to a solution, much like how you can simply step over a rock in three dimensions while an ant, seeing a wall ahead that seems to extend upward forever, must walk around it.
Diagram created by Author.

Perhaps by utilizing imaginary numbers, neural networks will be able to explore potential solutions farther than they ever could before.
Fortunately, this problem has not been left completely untouched, and a decent amount of research has been conducted on complex-valued neural networks (CVNNs). The first problem addressed in almost all of these papers is the activation function: for backpropagation to work, the activation function that provides the network's nonlinearity must be analytic, i.e., complex-differentiable at every point. Liouville's theorem, however, shows that every bounded entire function (one analytic on the whole complex plane) must be constant, and hence trivially linear.
Therefore, in order to fulfill the purpose of the activation function, bounded functions like the sigmoid and the hyperbolic tangent have no direct place in CVNNs. When extended to the complex plane, these functions acquire singularities, points at which they are not differentiable, occurring periodically in the plane. At these singularities the function explodes toward infinity, sending the neural network's computations haywire.
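A quick numerical illustration (my own sketch, not from the article): extending the standard sigmoid 1/(1 + e^(-z)) to complex z, the denominator vanishes wherever e^(-z) = -1, i.e., at z = iπ(2k + 1), and the function blows up as those points are approached.

```python
import numpy as np

def complex_sigmoid(z):
    """Standard sigmoid extended to complex inputs."""
    return 1.0 / (1.0 + np.exp(-z))

# The denominator vanishes where exp(-z) = -1, i.e. at z = i*pi*(2k+1).
# Evaluating ever closer to the first pole shows the blow-up:
for eps in (1e-1, 1e-3, 1e-6):
    z = 1j * np.pi * (1 - eps)  # approach the pole at z = i*pi
    print(f"eps={eps:.0e}  |sigmoid(z)| = {abs(complex_sigmoid(z)):.3e}")
```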
A common workaround for this problem in CVNN research is to use a complex sigmoid function, which inevitably has singularities in the plane, but to limit the search space for the weight and bias parameters so that they always remain within a 'safe' region and never come close to a computational explosion.
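One simple way such a constraint might be realized (a hypothetical sketch; the referenced papers use their own, more elaborate schemes) is to project each weight back into a disc of fixed radius after every update, keeping parameters well away from the poles:

```python
import numpy as np

# Hypothetical "safe region" projection: keep every complex weight inside
# a disc of radius R < pi, so that (with suitably bounded inputs) the
# pre-activations never drift near the poles at i*pi*(2k+1).
SAFE_RADIUS = 2.0  # illustrative choice, well below pi

def project_to_safe_region(w, radius=SAFE_RADIUS):
    """Rescale any weight whose magnitude exceeds the safe radius."""
    mag = np.abs(w)
    scale = np.where(mag > radius, radius / mag, 1.0)
    return w * scale

w = np.array([0.5 + 0.5j, 3.0 - 4.0j])  # second weight is out of bounds
print(project_to_safe_region(w))         # magnitudes capped at 2.0
```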
Although significant mathematical work is involved, simply put, a backpropagation algorithm for a complex-valued neural network needs to account for the error with respect not only to the real component of each weight but also to the imaginary component. Because it deals with these components separately, it doubles the dimensionality of the search space.
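To make this concrete, here is a minimal sketch (my own illustration, not code from the papers) of one such update for a single complex weight w fitting t ≈ w·x under the real loss E = |wx − t|²; the two partial derivatives, with respect to the real part a and imaginary part b of w, pack into one complex step:

```python
import numpy as np

# One gradient step on E = |w*x - t|**2 for a single complex weight w.
# Writing w = a + i*b, we need both dE/da and dE/db; packing them as
# dE/da + i*dE/db gives the complex gradient 2*(w*x - t)*conj(x).
def complex_grad_step(w, x, t, lr=0.05):
    e = w * x - t                 # complex residual
    grad = 2.0 * e * np.conj(x)   # = dE/da + i * dE/db
    return w - lr * grad

w, x, t = 0.0 + 0.0j, 1.0 - 2.0j, 3.0 + 1.0j
for step in range(20):
    w = complex_grad_step(w, x, t)
print(w * x)  # converges toward the target t = 3+1j
```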
Because of this expanded, sparser search space, a Random Search Algorithm (RSA) is used in the first few hundred epochs of training to hop around the space, finding the neighborhood of a local minimum from which the complex-valued backpropagation algorithm, purposely set to a very slow learning rate, takes over. In this sense, the Random Search Algorithm is a heavy-duty initialization technique. Consider the training error using this method, with the RSA performed for the first 100 epochs and complex-valued backpropagation for the remainder:
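A bare-bones version of this two-phase schedule might look like the following (an illustrative toy, reusing the same one-weight loss as above; the cited paper's RSA is more involved):

```python
import numpy as np

rng = np.random.default_rng(42)

def loss(w, x, t):
    return np.abs(w * x - t) ** 2  # real-valued loss of a complex weight

x, t = 1.0 - 2.0j, 3.0 + 1.0j

# Phase 1: random search -- hop around the complex plane, keep the best w.
best_w, best_loss = 0.0 + 0.0j, np.inf
for epoch in range(100):
    candidate = best_w + (rng.standard_normal() + 1j * rng.standard_normal())
    if loss(candidate, x, t) < best_loss:
        best_w, best_loss = candidate, loss(candidate, x, t)

# Phase 2: complex-valued gradient descent with a deliberately slow rate.
w, lr = best_w, 0.01
for epoch in range(1000):
    grad = 2.0 * (w * x - t) * np.conj(x)  # dE/da + i*dE/db, as above
    w -= lr * grad

print(best_loss, loss(w, x, t))  # RSA loss vs. final refined loss
```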
Source: "Comparison of the Complex Valued and Real Valued Neural Networks Trained with Gradient Descent and Random Search Algorithms". Image free to share.

It is clear that the RSA has found a suitable starting place by around the 75th epoch, from which the gradient descent algorithm produces a huge drop in error followed by a long, gradually diminishing tail.
Generally, papers testing CVNN performance against real-valued neural networks (RVNNs) report either similar or superior performance. Adding the extra dimension of imaginary numbers raises a whole host of new problems, but the few solutions constructed so far seem adequate to bring CVNNs on par with RVNNs. More recent research even involves the construction of complex convolutions, LSTMs, and batch normalizations to further assist CVNNs.
Overall, complex neural networks can be summed up in one word: promising. While their current showing is demoralizing (performance merely on par with standard neural networks, at a higher computational cost and with less robustness across a diversity of problems), recent studies show that with enough development, CVNNs can vastly outperform their real-valued siblings on audio-related tasks (music transcription using the MusicNet dataset and Speech Spectrum Prediction).
While there is little hope that CVNNs will take the machine learning world by storm in the near future, continual progress in addressing the many complexities (pun intended) that arise may drastically increase their power on specific tasks, for example in modelling signals (audio, electrical, etc.), much as other architectures have been designed for handling images and sequential data. The idea of integrating imaginary numbers into neural networks is definitely one to look out for.
Further reading
- An Introductory Paper introducing the problems and solutions in complex neural networks.
- An In-Depth Paper discussing specifically the singularities and problems of activation functions in the complex plane. Good visualizations but mathematically intensive.
- A Recent Paper introducing many mathematically intensive developments in complex neural networks, showing superior performance on audio-related tasks.
Translated from: https://medium.com/analytics-vidhya/what-would-happen-if-neural-network-states-were-complex-numbers-dc5b47731184