Feedforward neural network (前饋神經網絡): comparison of revisions

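Revision as of 09:28, 29 June 2020 (Monday), edited by Dr. Greywolf (administrator; minor edit, section "See also"), compared with the latest revision as of 06:18, 10 February 2023 (Friday), edited by Dr. Greywolf (administrator; minor edit, summary "zap1").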
(13 intermediate revisions by 2 other users are not shown.)
Line 1:
[[File:Colored neural network.svg|thumb|300px|An abstract diagram of a feedforward artificial neural network. Each circle represents a [[人工神經細胞|neuron]], and each neuron's activation level is affected only by the activation levels of the neurons in the layer before it.<ref>"[https://www.frontiersin.org/research-topics/4817/artificial-neural-networks-as-models-of-neural-information-processing Artificial Neural Networks as Models of Neural Information Processing | Frontiers Research Topic]". Retrieved 2018-02-20.</ref>]]
{{ruby-yue|'''前饋神經網絡'''|cin4 gwai3 san4 ging1 mong5 lok6}} ({{lang-en|'''feedforward neural network'''}}) is the simplest and earliest type of [[人工神經網絡|artificial neural network]]<ref>Zell, Andreas (1994). ''Simulation Neuronaler Netze'' [Simulation of Neural Networks] (in German) (1st ed.). Addison-Wesley. p. 73.</ref>. A feedforward neural network has an '''input layer''' (<code>input</code>) and an '''output layer''' (<code>output</code>), and may also have a '''hidden layer''' (<code>hidden</code>)<ref group="註">In practice, a feedforward network with no hidden layer can usually handle only simple [[線性關係|linear relationships]], so most useful feedforward networks have hidden layers.</ref>. Each [[人工神經細胞|neuron]] is governed by an equation of this form<ref name="sch">Schmidhuber, J. (2015). "Deep Learning in Neural Networks: An Overview". ''Neural Networks''. 61: 85–117.</ref><ref>Ivakhnenko, A. G. (1973). ''Cybernetic Predicting Devices''. CCM Information Corporation.</ref>:
:<math>t = W_1 A_1 + W_2 A_2 + \cdots</math> (see [[啟動函數|activation function]])
In this equation, <math>t</math> is the neuron's activation level, <math>A_n</math> is the activation level of the <math>n</math>-th neuron in the preceding layer, and <math>W_n</math> is the weight of that <math>n</math>-th neuron (how strongly it influences <math>t</math>). The <math>A_n</math> include no neurons outside the preceding layer, so [[訊號|signals]] in the network '''travel in one direction only'''. This makes feedforward networks quite unlike [[生物神經網絡|biological neural networks]], and it is the main difference between feedforward networks and [[遞迴神經網絡|recurrent neural networks]] (RNN)<ref name="diff">[https://towardsdatascience.com/the-differences-between-artificial-and-biological-neural-networks-a8b46db828b7 The differences between Artificial and Biological Neural Networks]. 
''Towards Data Science''.</ref>. Even so, feedforward networks have proven able to handle [[數據|data]] that is '''non-sequential''' (a string of [[文字|text]] is sequential, meaning that earlier [[資訊息|information]] affects later information) and '''not time-dependent''' (the information carried by time-dependent data is affected by time, <math>\text{info} = f(\text{time})</math>)<ref name="brilliantorg">[https://brilliant.org/wiki/feedforward-neural-networks/#:~:text=Feedfoward%20neural%20networks%20are%20primarily,)%20(x%2Cy). Feedforward neural network]. ''Brilliant.org''.</ref>. For example, researchers in [[遊戲 AI|game AI]] have successfully trained a [[多層感知機|multilayer perceptron]] (see below) to play [[食鬼|Pac-Man]]<ref>Lucas, S. M. (2005, April). Evolving a Neural Network Location Evaluator to Play Ms. Pac-Man. In ''IEEE 2005 Symposium on Computational Intelligence and Games''.</ref>. Feedforward networks therefore remain in use in the twenty-first century<ref name="hosseini">Hosseini, H. G., Luo, D., & Reynolds, K. J. (2006). The comparison of different feed forward neural network architectures for ECG signal diagnosis. ''Medical engineering & physics'', 28(4), 372-378.</ref><ref name="shukla2009">Shukla, A., Tiwari, R., Kaur, P., & Janghel, R. R. (2009, March). Diagnosis of thyroid disorders using artificial neural networks. In ''2009 IEEE International Advance Computing Conference'' (pp. 1016-1020). IEEE.</ref>.
== Single-layer perceptron ==
Line 29:
:<math>\text{output} = g(\overrightarrow{w} \cdot \overrightarrow{x} + b)</math>; <math>[1]</math>
Here <math>\overrightarrow{x}</math> is the [[向量|vector]] of inputs, <math>\overrightarrow{w}</math> is the vector of weights, and <math>b</math> is the '''bias''', the neuron's own tendency to activate. For example, a neuron whose <math>b</math> is large and positive will tend to activate strongly whatever its input. Training uses [[監督式學習|supervised learning]]: the learning algorithm adjusts the <math>w</math> according to the values it reads, so that the network becomes better able to give accurate outputs<ref name="auer2008"/>.
;Example code
For example, the following [[源碼|source code]], written in [[Python 程式語言|Python]], defines a simple perceptron network<ref group="註">This perceptron has no mechanism for changing its weights, so it cannot learn.</ref><ref name="firstneural">[https://towardsdatascience.com/first-neural-network-for-beginners-explained-with-code-4cfd37e06eaf First neural network for beginners explained (with code)]. 
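''Towards Data Science''.</ref>:
<source lang="python">
Line 43:
{{clear}}
=== Error function ===
{{see also|損失函數}}
To teach a perceptron to learn, the first step is usually to define an [[誤差函數|error function]]. Learning, by [[定義|definition]], means changing one's behaviour in light of experience, so one of the most direct ways for a cognitive system to learn is to look at how far its predictions differ from what it actually experiences. An error function is a [[函數|function]] expressing which variables and constants determine the [[誤差|error]]. The following is a commonly used one<ref name="haykin2009">Haykin, S. S. (2009). ''Neural networks and learning machines'' (3rd ed.). Upper Saddle River: Pearson Education.</ref><ref>Jain, A. K., Mao, J., & Mohiuddin, K. M. (1996). Artificial neural networks: A tutorial. ''Computer'', (3), 31-44.</ref>:
Line 49:
:<math>E(X) = \frac{1}{N}\sum_{i=1}^N(g(\overrightarrow{w} \cdot \overrightarrow{x}_i + b) - y_i)^2</math>; <math>[3]</math> (substituting <math>[1]</math> into <math>[2]</math>)
In this formula, <math>E(X)</math> reflects the overall error. <math>(\text{output}_i - y_i)^2</math> is the squared difference between the <math>i</math>-th prediction (<math>\text{output}_i</math>) and the <math>i</math>-th value actually observed (<math>y_i</math>). A square is never [[正數|negative]], so individual errors cannot cancel out, and summing them<ref group="註"><math>\sum</math> denotes [[加總|summation]].</ref> over all <math>N</math> predictions and dividing by <math>N</math> gives the perceptron's average error. If every prediction equals the observed value (<math>\text{output}_i = y_i</math>), then <math>E(X) = 0</math>. The job of a learning algorithm is therefore to change the values of the <math>w</math> and of <math>b</math> so that <math>E(X)</math> becomes as small as possible<ref name="haykin2009"/>.
=== Delta rule ===
Line 70:
;Step 2: update the weights
#Each weight receives a '''gradient''' from the formula;
#The size of each weight's change equals its gradient multiplied by <math>\eta</math>. If <math>\eta</math> is 0 the network never changes, and if <math>\eta</math> is large the network changes quickly, so <math>\eta</math> controls how fast the network learns;
#These values are "backpropagated" to the network, and each weight is set to its new value (the actual update of the <math>w_{ij}</math>);
Line 78:
=== Limitations ===
The limitation of the [[單層感知機|single-layer perceptron]] (a perceptron with no hidden layer) is that it is a linear classifier: it can learn only linear classifications. When classifying a set of cases by two variables <math>x</math> and <math>y</math>, for instance, the classifier draws a line that is a function of <math>x</math> and <math>y</math> (e.g. <math>y = 2x + 5</math>); if that line correctly separates the two classes of cases, it is a successful [[線性分類機|linear classifier]]. A single-layer perceptron can handle only [[線性|linear]] relationships, and it fails when the actual relationship is not linear. Consider the following two figures:
{{clear}}
[[File:Kernel Machine.svg|510px|center]]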
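The pieces of this section (formula [1], the error function [3], the weight-update step of the delta rule, and the linear-separability limit) can be put together in a short sketch. The Python below is only an illustration under stated assumptions, not the article's own code: the sigmoid chosen for <math>g</math>, the learning rate <code>eta = 0.5</code>, the epoch count, the zero initial weights, the AND/XOR toy data, and every function name (<code>g</code>, <code>predict</code>, <code>mean_squared_error</code>, <code>accuracy</code>, <code>train</code>) are made up for the example. Each weight is moved against its gradient so that the error shrinks.

```python
import math

# Toy data sets (assumed for illustration): pairs of (inputs x, target y).
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def g(z):
    """Sigmoid activation, an assumed choice for g in formula [1]."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Formula [1]: output = g(w . x + b)."""
    return g(sum(wi * xi for wi, xi in zip(w, x)) + b)


def mean_squared_error(w, b, samples):
    """Formula [3]: mean of the squared differences output_i - y_i."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in samples) / len(samples)


def accuracy(w, b, samples):
    """Share of samples classified correctly with a 0.5 threshold."""
    hits = sum((predict(w, b, x) > 0.5) == (y == 1) for x, y in samples)
    return hits / len(samples)


def train(samples, eta=0.5, epochs=5000):
    """Delta-rule training: nudge each weight against its gradient."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            out = predict(w, b, x)
            # Gradient of the squared error with respect to w . x + b,
            # using g'(z) = out * (1 - out); the constant 2 is folded into eta.
            delta = (out - y) * out * (1.0 - out)
            w = [wi - eta * delta * xi for wi, xi in zip(w, x)]
            b -= eta * delta
    return w, b


if __name__ == "__main__":
    for name, data in (("AND", AND_DATA), ("XOR", XOR_DATA)):
        w, b = train(data)
        print(name, "error:", round(mean_squared_error(w, b, data), 4),
              "accuracy:", accuracy(w, b, data))
```

Under these assumptions the single neuron should learn AND, which a straight line can separate, while on XOR it cannot classify all four cases correctly however long it trains, because no single line separates the two XOR classes.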
Line 86:
== Multilayer perceptron ==
{{main|多層感知機}}
A [[多層感知機|multilayer perceptron]] (MLP) is an artificial neural network made up of multiple perceptrons. It has '''hidden layers''': layers of neurons that neither receive input directly from outside nor send output directly to the outside. Unlike a single-layer perceptron, a multilayer perceptron can handle non-linear relationships, which makes it valuable in many neural-network applications<ref>Pal, S. K., & Mitra, S. (1992). ''Multilayer perceptron, fuzzy sets, classification''.</ref>. A three-layer perceptron (with one hidden layer) can be pictured as follows<ref name="sch"/>:
{{clear}}
[[File:Artificial neural network.svg|360px|center]]
Line 93:
By definition, a multilayer perceptron has the following properties<ref name="sch"/>:
*Every neuron in layer <math>i</math> is connected to the neurons in layer <math>i-1</math>, so every neuron in layer <math>i-1</math> can influence the activation level of the neurons in layer <math>i</math>. Adjacent layers are therefore '''fully connected''', although a weight may be 0;
*Neurons in layer <math>i</math> are not affected by neurons in layer <math>j</math>, where <math>j</math> is any integer greater than <math>i</math>;
*Neurons within the same layer are not connected to one another.
=== Backpropagation ===
{{main|反向傳播算法}}
[[反向傳播算法|Backpropagation]] is a generalisation of the [[delta 法則|delta rule]] (see above): once the error function is available, one can compute how the <math>w</math> should be adjusted<ref>Nielsen, Michael A. (2015). "Chapter 6". ''Neural Networks and Deep Learning''.</ref><ref>Kelley, Henry J. (1960). "Gradient theory of optimal flight paths". ''ARS Journal''. 30 (10): 947–954.</ref>. With [[隨機梯度下降法|stochastic gradient descent]], for example, the following formula gives the change to each weight<ref>Mei, Song (2018). "A mean field view of the landscape of two-layer neural networks". ''Proceedings of the National Academy of Sciences''. 115 (33): E7665–E7671.</ref>:
:<math>w_{ij}(t + 1) = w_{ij}(t) - \eta\frac{\partial E(X)}{\partial w_{ij}} + \xi(t)</math>; <math>[5]</math>
where
Line 106:
*<math>E(X)</math> is the error, reflecting how far the network's output for the case is from the correct output;
*<math>\frac{\partial E(X)}{\partial w_{ij}}</math> is the [[偏導數|partial derivative]] of <math>E(X)</math> with respect to <math>w_{ij}</math>;
*<math>\xi(t)</math> is a [[隨機|random]] value<ref>Dreyfus, Stuart (1962). "The numerical solution of variational problems". ''Journal of Mathematical Analysis and Applications''. 5 (1): 30–45.</ref><ref>Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (1986). "Learning representations by back-propagating errors". ''Nature''. 
323 (6088): 533–536.</ref>. If a neural network implemented as a computer program follows this formula (or a similar one), then after processing each case it computes how its weights should change and passes this [[資訊息|information]] on "how each weight should change" back to the network (hence "backpropagation"). Each time a weight changes, the size of the change is related to the error, and the more the weight contributes to computing the output, the larger its change<ref>Dreyfus, Stuart (1973). "The computational solution of optimal control problems with time lag". ''IEEE Transactions on Automatic Control''. 18 (4): 383–385.</ref>. The network keeps changing as it processes cases, until the error comes closer and closer to zero<ref>Dreyfus, Stuart E. (1990-09-01). "Artificial neural networks, back propagation, and the Kelley-Bryson gradient procedure". ''Journal of Guidance, Control, and Dynamics''. 13 (5): 926–928.</ref>. Besides stochastic gradient descent, backpropagation can be carried out in many other ways; see topics related to [[最佳化|optimization]]<ref>Huang, Guang-Bin; Zhu, Qin-Yu; Siew, Chee-Kheong (2006). "Extreme learning machine: theory and applications". ''Neurocomputing''. 70 (1): 489–501.</ref><ref>Widrow, Bernard; et al. (2013). "The no-prop algorithm: A new learning algorithm for multilayer neural networks". ''Neural Networks''. 37: 182–188.</ref>. The training algorithm for a multilayer perceptron is essentially the same as that for a single-layer perceptron.
== Example application code ==
The following is [[源碼|source code]] for a multilayer perceptron network written in [[C#]] ("initialize" means [[初始化|initialization]])<ref name="tdsCsharpANN">[https://towardsdatascience.com/building-a-neural-network-framework-in-c-16ef56ce1fef Building a neural network framework in C#]. ''Towards Data Science''.</ref>:
Line 295:
For each network, they check how accurate its predictions are after it has finished learning. They find that the composite feedforward network performs better than a single ordinary feedforward network, so they have found something useful and can publish their results in an academic journal on [[機械學習|machine learning]].
{{clear}}
== See also ==
*[[人工神經網絡|Artificial neural network]]
Line 300:
*[[反向傳播算法|Backpropagation]]
*[[機械學習|Machine learning]]
*[[自編碼器|Autoencoder]]: in the simplest case, an autoencoder is a feedforward network that uses its input as the target output and has few hidden-layer neurons.
== Bibliography ==
Abu Dalffa, M., Abu-Nasser, B. S., & Abu-Naser, S. S. (2019). [http://dstore.alazhar.edu.ps/xmlui/bitstream/handle/123456789/142/DALTLUv1.pdf?sequence=1&isAllowed=y Tic-Tac-Toe Learning Using Artificial Neural Networks] (PDF).
Bhaskar, K., & Singh, S. N. (2012). [https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6170987 AWNN-assisted wind power forecasting using feed-forward neural network] (PDF). 
''IEEE transactions on sustainable energy'', 3(2), 306-315.
Valian, E., Mohanna, S., & Tavakoli, S. (2011). Improved cuckoo search algorithm for feedforward neural network training. ''International Journal of Artificial Intelligence & Applications'', 2(3), 36-43.
{{clear}}
== Notes ==
{{Reflist|group=註|3}}
== References ==
{{reflist|3}}
== External links ==
*[https://web.archive.org/web/20090507210502/http://www.emilstefanov.net/Projects/NeuralNetworks.aspx Feedforward neural networks tutorial].
*[https://web.archive.org/web/20090923121811/http://wiki.syncleus.com/index.php/DANN%3ABackprop_Feedforward_Neural_Network Feedforward Neural Network: Example].
*[http://media.wiley.com/product_data/excerpt/19/04713491/0471349119.pdf Feedforward Neural Networks: An Introduction].
{{機械學習}}
[[Category:人工神經網絡]]