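System ID U0026-2108201215362500
Title (Chinese) 適用於RISC處理器之歷程快取記憶體架構
Title (English) Trace Reuse Cache for RISC Processor Architecture
University National Cheng Kung University (成功大學)
Department (Chinese) 電腦與通信工程研究所
Department (English) Institute of Computer & Communication
Academic year 100 (ROC calendar, 2011-2012)
Semester 2
Year of publication 101 (ROC calendar, 2012)
Author (Chinese) 郭漢衿
Author (English) Hang-Chin Kuo
Student ID q36984040
Degree Master's
Language Chinese
Pages 38
Committee Advisor: 陳中和
Committee member: 邱瀝毅
Committee member: 蘇文鈺
Keywords (Chinese) 歷程快取記憶體 (trace reuse cache)  功率效能 (power efficiency)  精簡指令集處理器 (RISC processor)
Keywords (English) Trace Reuse Cache  power consumption  RISC
Subject classification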
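Abstract (translated from Chinese) To improve the execution performance and reduce the power consumption of a RISC CPU, this thesis proposes a new instruction cache mechanism that works with support circuitry embedded in the CPU's instruction delivery stage so that instructions are used more effectively, and it analyzes the performance of the overall system architecture.
The architecture is built on an ARM-based RISC CPU. A Trace Reuse Cache with a low-power instruction delivery mechanism is integrated through minor modifications, so that the processor supports this delivery mechanism without excessive area overhead. The goals are to lower the energy consumption of the instruction cache and to improve execution performance as far as possible. The overall architecture is evaluated with EDA design tools to measure the improvement, and it is compared against the original architecture to analyze its strengths and weaknesses and to obtain more accurate experimental data.
According to the experimental results, a system that pairs the Trace Reuse Cache with a conventional instruction cache performs better than a system that uses only a conventional instruction cache, and it further reduces overall system energy consumption. Besides giving designers more options, it also leaves room to enhance the functionality of the overall system.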
Abstract (English) In this thesis, we propose a new instruction delivery mechanism, the Trace Reuse Cache (TRC), to reduce the power consumption of RISC processor architectures. With the TRC and an additional circuit in the CPU fetch stage, suitable instructions in the system can be selected for reuse, yielding better performance. We also analyze the whole system and evaluate its IPC, hit rate, and power consumption.
Based on an ARM-compatible 5-stage RISC core, we attach the TRC to the cache memory system and modify part of the pipeline architecture to support TRC instruction delivery with small area overhead. The goal is to lower the power consumption of the cache itself and to improve instruction fetch performance.
Experimental results show that our design dissipates less power than the traditional configuration with only an instruction cache. Moreover, the TRC gives the designer an additional option and has the potential to enhance the system.
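Table of contents Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Thesis Organization
Chapter 2 Background and Related Work
2.1 Trace cache
2.2 Filter cache
2.3 Modified cache
2.4 Power Evaluation
Chapter 3 Design and Implementation of the System Architecture
3.1 Trace Reuse Cache Architecture
3.2 Overview of Trace Reuse Cache Components
3.3 Structure and Operation of the HTB
3.4 Structure and Operation of the TET
3.5 Trace Record Creation
3.6 TRC Instruction Delivery Mechanism
Chapter 4 Verification Environment and Experimental Method
4.1 Simulation Environment
4.2 Performance Measurement and Analysis
4.3 Power Efficiency Evaluation
Chapter 5 Experimental Results and Data Analysis
5.1 IPC Results Analysis
5.2 Power Evaluation and Analysis
5.3 TRC Delivery Rate
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
References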
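Full-text access rights
  • Authorized for on-campus browsing/printing of the electronic full text; publicly available from 2015-09-03.
  • Authorized for off-campus browsing/printing of the electronic full text; publicly available from 2015-09-03.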

