January 5, 2021 / Last Modified January 6, 2021

In this article, "deep neural network" means a neural network with multiple hidden layers.

The original perceptron network had only two layers, an input layer and an output layer, which is why Minsky could show that such a network cannot even learn a rule as simple as XOR. In fact, Minsky also pointed out that adding more layers solves the XOR problem, but few people paid attention to that part at the time, and neural network research entered an ice age.

An MIT professor named Marvin Minsky (who was a grade behind Rosenblatt at the same high school!), along with Seymour Papert, wrote a book called _Perceptrons_ (MIT Press), about Rosenblatt's invention. They showed that a single layer of these devices was unable to learn some simple but critical mathematical functions (such as XOR). In the same book, they also showed that using multiple layers of the devices would allow these limitations to be addressed. Unfortunately, only the first of these insights was widely recognized. As a result, the global academic community nearly entirely gave up on neural networks for the next two decades.

In any case, neural networks kept developing. Researchers proved that a neural network with a hidden layer is in fact a universal approximator: it can approximate any function. In 1986, backpropagation, an algorithm that makes gradient computation efficient, resurfaced and its value was finally recognized, bringing neural networks another spring. Even so, much practical work stayed with architectures that had only a single hidden layer. One hidden layer is enough in theory, but in practice it consumes more computational resources: given two networks of equal capability, the one with multiple hidden layers typically costs less to compute than the one with a single hidden layer, most likely because the deeper network, despite having more layers, needs fewer neurons in total. (By analogy with software development: a well-layered design reduces the amount of code, improves maintainability, and is easier to optimize.)
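A rough back-of-the-envelope sketch of the parameter-economy point above (the layer sizes here are my own illustrative choices, not from any cited result; this only counts parameters, it does not prove the two networks have equal capability):

```python
def mlp_param_count(layer_sizes):
    """Total number of weights and biases in a fully connected network.

    layer_sizes: e.g. [784, 30, 10] = input size, hidden sizes..., output size.
    Each layer contributes n_in * n_out weights plus n_out biases.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# A wide single-hidden-layer network vs. a deeper network with
# fewer hidden neurons in total (100 vs. 40 + 20 = 60):
shallow = mlp_param_count([784, 100, 10])
deep = mlp_param_count([784, 40, 20, 10])

print(shallow)  # 79510
print(deep)     # 32430
```

If the deeper network really can match the shallow one on a given task, as the argument above suggests, it does so with far fewer parameters to store and update.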

In the 1980's most models were built with a second layer of neurons, thus avoiding the problem that had been identified by Minsky and Papert (this was their "pattern of connectivity among units," to use the framework above). And indeed, neural networks were widely used during the '80s and '90s for real, practical projects. However, again a misunderstanding of the theoretical issues held back the field. In theory, adding just one extra layer of neurons was enough to allow any mathematical function to be approximated with these neural networks, but in practice such networks were often too big and too slow to be useful.

Although researchers showed 30 years ago that to get practical good performance you need to use even more layers of neurons, it is only in the last decade that this principle has been more widely appreciated and applied. Neural networks are now finally living up to their potential, thanks to the use of more layers, coupled with the capacity to do so due to improvements in computer hardware, increases in data availability, and algorithmic tweaks that allow neural networks to be trained faster and more easily. We now have what Rosenblatt promised: "a machine capable of perceiving, recognizing, and identifying its surroundings without any human training or control."

After this long detour, the first point I want to make is this: the pattern of deep neural networks, networks with many hidden layers, is strikingly consistent with how we handle so many other things!





So deep circuits make the process of design easier. But they're not just helpful for design. There are, in fact, mathematical proofs showing that for some functions very shallow circuits require exponentially more circuit elements to compute than do deep circuits. For instance, a famous series of papers in the early 1980s* (*The history is somewhat complex, so I won't give detailed references. See Johan Håstad's 2012 paper On the correlation of parity and small-depth circuits for an account of the early history and references. ) showed that computing the parity of a set of bits requires exponentially many gates, if done with a shallow circuit. On the other hand, if you use deeper circuits it's easy to compute the parity using a small circuit: you just compute the parity of pairs of bits, then use those results to compute the parity of pairs of pairs of bits, and so on, building up quickly to the overall parity. Deep circuits thus can be intrinsically much more powerful than shallow circuits.
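Nielsen's pairwise construction above can be sketched in a few lines of Python (my own illustration; the function and variable names are mine). Each pass XORs adjacent pairs, so the "circuit" has depth O(log n) while using only O(n) XOR gates in total:

```python
def parity_tree(bits):
    """Compute the parity of a list of bits as a deep circuit would:
    XOR adjacent pairs, then pairs of pairs, until one bit remains.
    """
    values = list(bits)
    while len(values) > 1:
        if len(values) % 2:  # pad an odd layer so every bit has a partner
            values.append(0)
        # one "layer" of the circuit: XOR each adjacent pair
        values = [values[i] ^ values[i + 1]
                  for i in range(0, len(values), 2)]
    return values[0]

print(parity_tree([1, 0, 1, 1]))  # 1 (three ones, so odd parity)
```

The contrast with a shallow circuit is exactly the point of the quoted result: a depth-limited circuit computing the same function needs exponentially many gates, while this layered version stays small.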

The benefits of layering are obvious: each layer solves its own problem, keeps a degree of independence from the others, evolves on its own, and has a clear internal structure. Anything that can be decomposed into layers can be solved, or developed, better. All of this, I feel, closely resembles a neural network with many hidden layers: between the input layer and the output layer sit many hidden layers, each doing things we still cannot quite articulate.






-- EOF --






©Copyright 麦新杰 Since 2019 Python笔记
