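System ID U0026-0812200915370926
Title (Chinese) 符合ARMv6格式多核心系統之Dual-port Memory Management Unit之設計
Title (English) Design of a dual-port Memory Management Unit for multi-core system conforming to ARMv6 format
Institution National Cheng Kung University
Department (Chinese) 電機工程學系專班
Department (English) Department of Electrical Engineering (on the job class)
Academic year 97 (ROC calendar; 2008-2009)
Semester 2
Publication year 98 (ROC calendar; 2009)
Author (Chinese) 李宣賢
Author (English) Hsuan-hsien Lee
Email n2795122@mail.ncku.edu.tw
Student ID n2795122
Degree Master's
Language Chinese
Pages 76
Oral defense committee Committee member-黃俊岳
Advisor-陳中和
Committee member-謝明得
Committee member-侯廷偉
Keywords (Chinese) dual-port TLB  多核心系統  BPLRU  ARMv6 MMU架構 
Keywords (English) BPLRU  multi-core system  dual-port TLB  ARMv6 MMU architecture 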
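Abstract (translated from Chinese) The challenges in designing a Memory Management Unit (MMU) for a multi-core system are multi-core support, high efficiency, and low power consumption. This thesis therefore analyzes the MMU architecture, the dual-port architecture, and TLB replacement policies in order to address these challenges and implement an MMU that conforms to the ARMv6 architecture.

The experimental environment is the Superscalar platform designed in our laboratory. It comprises Symphony32, a nine-stage pipelined superscalar processor based on a Register Update Unit, and a data cache that is likewise dual-port. The final data analysis shows that, in a single-core system, the dual-port architecture reduces the number of memory accesses by 9.72% (root mean square) relative to the single-port architecture and reduces the processor's memory access time by 19.25% (root mean square). Its area is only 40% of that of a design using a direct-mapped TLB, so it retains optimized efficiency and lower power consumption in a smaller area.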
Abstract (English) In this thesis, we design a Memory Management Unit (MMU) that supports multi-core systems and conforms to the ARMv6 format. We analyze and discuss the architecture of the ARMv6 MMU. First, we explain the benefits of using a dual-port structure. Second, we analyze and select the replacement algorithm for the TLB. Finally, we present an implementation of the ARMv6 MMU that is based on the outcome of this evaluation and on practical requirements.

Our experimental environment is a Superscalar platform that includes Symphony32 (a nine-stage pipelined superscalar processor based on a Register Update Unit), the dual-port MMU, and a dual-port data cache. In this single-core environment, the simulation results show that the TLB architecture and the replacement policy are the key factors: together they reduce TLB misses by 99.2% on average compared with a direct-mapped design, and the dual-port architecture reduces memory accesses by 9.72% compared with the single-port architecture. The proposed MMU is therefore an effective design for our multi-core system.
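Table of contents Abstract (Chinese) III
Abstract IV
Contents V
List of Tables IX
List of Figures X
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Contributions 1
1.3 Thesis Organization 2
Chapter 2 Background 3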
2.1 Virtual memory 3
2.2 MMU 4
2.2.1 Execution mode 5
2.2.2 Access permission 6
2.2.3 Cache policy 6
2.3 About TLB 7
2.3.1 TLB 7
2.3.2 Code Translations for TLB Power Reduction 8
2.3.3 Two-level TLB architecture 8
2.3.4 Reduce power consumption of TLB 9
Chapter 3 Architecture Design and Analysis 11
3.1 Benefits of ARMv6 MMU structure 11
3.2 Benefits of dual-port structure 12
3.3 Replacement algorithm 13
3.3.1 Round-Robin replacement algorithm 14
3.3.2 LRU replacement algorithm 14
3.3.3 BPLRU replacement algorithm 14
3.3.4 Conclusion 16
3.4 New pipeline structure for Symphony32 17
3.4.1 Traditional pipeline of write sequence 17
3.4.2 New pipeline of write sequence 18
3.4.3 Pipeline of read sequence 19
3.5 Summary 20
Chapter 4 Architecture Implementation 21
4.1 Overview 21
4.2 Control module 22
4.2.1 IMMU control module 22
4.2.2 DMMU control module 24
4.3 Replace module 27
4.4 TLB module 27
4.4.1 Instruction TLB 28
4.4.2 Data TLB 28
4.5 Comparator module 30
4.6 Table walk module 31
4.6.1 Hardware page table translation 31
4.6.2 Backwards-compatible page table translation 33
4.6.3 ARMv6 page table translation 34
4.6.4 Section format 37
4.6.5 Page format 38
4.6.6 Control FSM of table walk module 42
4.7 Fault check module 43
4.7.1 Memory access control 43
 Domains 44
 Access permissions 44
 Execute never bit 45
 Access Bit 46
4.7.2 MMU aborts 46
4.7.3 MMU fault checking 47
 Alignment fault 47
 Translation fault 47
 Access bit fault 47
 Domain fault 48
 Permission fault 48
4.7.4 Fault status and address 48
4.7.5 Memory attributes and types 49
4.7.6 Memory region attributes 50
4.7.7 Implementation 52
 Check alignment fault 52
 External abort on translation 53
 Access bit fault 53
 Translation fault 53
 Domain fault 54
 Permission fault 55
4.8 Summary 56
Chapter 5 Experimental Environment and Data Analysis 58
5.1 Environment Setup 58
5.2 Test Programs and Software Platform 59
5.2.1 Experiment environment 61
5.3 Experimental Data and Result Analysis 62
5.3.1 Single-core platform 62
 Area and timing analysis 62
 Hit ratios of iTLB 64
 Hit ratio of dTLB 65
 Reduced number of memory access 67
 Reduced time of memory access 68
5.3.2 Dual-core platform 69
 Reduced time of memory access 69
 Performance improvement 70
5.4 Summary 71
Chapter 6 Conclusion and Future Work 72
6.1 Conclusion 72
6.2 Future Work 73
References 74
References [1] ARM Corporation, “ARM11 MPCore Processor Technical Reference Manual”.
[2] ARM Corporation, “ARM Architecture Reference Manual”.
[3] MIPS, “MIPS R4000 Microprocessor User’s Manual”.
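[4] 林璟汶, “符合電子系統層級設計概念之可參數化超純量亂序執行微處理器設計、分析與實現” [Design, analysis, and implementation of a parameterizable superscalar out-of-order microprocessor conforming to electronic system-level design concepts], Master's thesis, Institute of Computer and Communication Engineering, National Cheng Kung University, 2008.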
[5] T. Takayanagi, et al., “Embedded Memory Design for a Four Issue Superscalar RISC Microprocessor,” In Proc. IEEE 1994 Conf. on Custom Integrated Circuits, pp 585-590, May 1994.
[6] I. Kadayif, A. Sivasubramaniam, M. Kandemir, G. Kandiraju, and G. Chen, “Generating Physical Address Directly for Saving Instruction TLB Energy,” In Proc. 35th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 185-196, Nov. 2002.
[7] R. Jeyapaul, S. Marathe, and A. Shrivastava, “Code Transformations for TLB Power Reduction,” In International Conference on VLSI Design, pp: 413-418, Jan. 2009.
[8] J. H. Choi, J. H. Lee, S. W. Jeong, S. D. Kim, and C. Weems, “A Low Power TLB Structure for Embedded Systems,” In Computer Architecture Letters, Vol. 1, pp: 3-3, Jan. 2002.
[9] Y.-J. Chang, “An Ultra Low-Power TLB Design,” In Proc. 2006 DATE ’06 on Design, Automation and Test in Europe, Vol. 1, pp: 1-6, Mar. 2006.
[10] Y.-J. Chang and M.-F. Lan, “Two New Techniques Integrated for Energy-Efficient TLB Design,” In IEEE Trans. on Very Large Scale Integration (VLSI) Systems, Vol. 15, pp: 13-23, Jan. 2007.
[11] J. R. Haigh, M. W. Wilkerson, J. B. Miller, T. S. Beatty, S. J. Strazdus, and L. T. Clark, “A low-power 2.5-GHz 90-nm level 1 cache and memory management unit,” In IEEE Journal of Solid-State Circuits, Vol. 40, pp: 1190-1199, May 2005.
[12] Mibench test bench, http://www.eecs.umich.edu/mibench/
[13] ARM Corporation, “ARM Developer Suite - Version 1.2 - ADS Debug Target Guide.”
[14] ARM Corporation, “ARM Developer Suite - Version 1.2 - ARM ELF Specification.”
[15] ARM Corporation, “ARM Developer Suite - Version 1.2 - AXD and arm Debuggers Guide.”
[16] ARM Corporation, “ARM Developer Suite - Version 1.2 - Assembler Guide.”
[17] ARM Corporation, “ARM Developer Suite - Version 1.2 - Codewarrior IDE Guide.”
[18] ARM Corporation, “ARM Developer Suite - Version 1.2 - Compilers and Libraries Guide.”
[19] ARM Corporation, “ARM Developer Suite - Version 1.2 - Developer Guide.”
[20] ARM Corporation, “ARM Developer Suite - Version 1.2 - Getting Started.”
[21] ARM Corporation, “ARM Developer Suite - Version 1.2 - Installation and License Management Guide.”
[22] ARM Corporation, “ARM Developer Suite - Version 1.2 - Linker and Utilities Guide.”
[23] H. Ghasemzadeh, S. Mazrouee, and M. Reza Kakoee, “Modified Pseudo LRU Replacement Algorithm,” In ECBS 2006, 13th Annual IEEE International Symposium and Workshop on Engineering of Computer Based Systems, pp: 376-342, Mar. 2006.
[24] C.-Y. Tseng and H.-C. Chen, “The Design of Way-Prediction Scheme in Set-Associative Cache for Energy Efficient Embedded System,” In 2009, CMC ’09. WRI International Conference on Communications and Mobile Computing, Vol. 3, pp: 3-7, Jan. 2009.
[25] M. Soryani, M. Sharifi, and M. Hossein Rezvani, “Performance Evaluation of Cache Memory Organizations in Embedded Systems,” In ITNG ’07. Fourth International Conference on Information Technology, pp: 1045-1050, Apr. 2007.
[26] H.-K. Jung, et al., “Performance Improvement and Low Power Design of Embedded Processor,” In 2008. ICCIT ’08. Third International Conference on Convergence and Hybrid Information Technology, Vol. 2 pp: 140-145, Nov. 2008.
[27] TSMC 0.18 μm Process 1.8-Volt SAGE-X™ Standard Cell Library Databook
[28] Stanford Parallel Applications for Shared Memory (SPLASH) test bench, http://www-flash.stanford.edu/apps/SPLASH/
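Full-text access rights
  • Authorized for on-campus browsing/printing of the electronic full text; publicly available from 2009-08-27.
  • Authorized for off-campus browsing/printing of the electronic full text; publicly available from 2009-08-27.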