

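   The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.
(※ If the thesis cannot be found, or its holding status reads "closed stacks, not public", the copy is not in the stacks and cannot be accessed.)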
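System ID U0026-3107201813464700
Title (Chinese) 利用電子筆記與線上資料之教材摘要歸納方法
Title (English) Summarization of Learning Materials Using Digital Notes and Online Data
Institution National Cheng Kung University
Department (Chinese) 資訊管理研究所
Department (English) Institute of Information Management
Academic year 106 (ROC calendar; 2017-2018)
Semester 2
Publication year 107 (ROC calendar; 2018)
Graduate student (Chinese) 林真瑜
Graduate student (English) Chen-Yu Lin
Student ID R76054064
Degree Master's
Language Chinese
Pages 63
Oral defense committee Advisor: 王惠嘉
Committee member: 劉任修
Committee member: 高宏宇
Committee member: 莊坤達
Keywords (Chinese) 筆記摘要  主題識別
Keywords (English) Note Summarization  Topic Identification
Subject classification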
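Chinese abstract Most students take notes in class, but while listening they often miss things and fail to record all of the material taught. Drawing on the idea of the sharing economy, this study links students' notes together so that each student's notes fill the gaps in the others'.
Integrating shared notes can improve learning, but the notes students copy down in class are not enough on their own to fully understand the teacher's explanations or to cover exam content. To learn effectively, students often have to spend time consulting the textbook themselves, yet the sheer volume of text makes it hard to pick out the key points. Beyond class notes, the points a teacher emphasizes are usually reflected in the lecture slides. Sometimes a concept the teacher wants to stress is covered in only a few sentences in the textbook, so students must search online for related material; the enormous volume of online data then causes information overload, and students spend even more time filtering out unhelpful web resources. This study therefore proposes a method that automatically integrates learning materials: it shares students' notes and draws on slides, the textbook, and web pages to compile content that helps learners understand the course.
This study conducted six experiments. Experiment 1 confirmed that using topic terms to assist in translating the notes does improve translation quality and content linking. Experiment 2 found that the system's linking performance was slightly better for Production Management than for Data Structures. Experiment 3 used Jensen-Shannon (JS) Divergence to evaluate the system's summaries across subjects; the summaries generated for Production Management were closer to the course content than those generated for Data Structures. Experiment 4 found that three attributes of the learning material integration module (whether topic terms are present, the number of non-topic words, and the ratio of words with important parts of speech) affect different subjects differently, and that none can be ignored; all three must be considered together. Experiment 5 compared the system with several earlier automatic document summarization methods and found that its summaries were closer to the content of both the notes and the exam questions. Experiment 6 evaluated several methods manually and found that the system performed very well in readability, informativeness, and completeness. The last two experiments confirm that the system's summaries not only better match what the teacher emphasizes in class but also help students grasp the key points of the course more quickly.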
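The three attributes examined in Experiment 4 can be illustrated with a minimal Python sketch. This record does not give the thesis's actual scoring formula, so the linear combination, the default weights, the part-of-speech tag names, and all function names below (`extract_features`, `score`, `select_top`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class SentenceFeatures:
    has_topic_term: bool        # does the sentence contain any topic term?
    non_topic_count: int        # number of words that are not topic terms
    important_pos_ratio: float  # share of words with an "important" POS tag


def extract_features(tagged_sentence, topic_terms, important_pos):
    """Compute the three attributes for one sentence.

    tagged_sentence is a list of (word, pos_tag) pairs; the tag set is
    whatever the upstream tagger produces (hypothetical here).
    """
    words = [word for word, _ in tagged_sentence]
    has_topic = any(word in topic_terms for word in words)
    non_topic = sum(1 for word in words if word not in topic_terms)
    important = sum(1 for _, pos in tagged_sentence if pos in important_pos)
    ratio = important / len(tagged_sentence) if tagged_sentence else 0.0
    return SentenceFeatures(has_topic, non_topic, ratio)


def score(features, weights=(1.0, -0.1, 1.0)):
    """Assumed linear score; the weights and their signs are illustrative only."""
    w_topic, w_non_topic, w_pos = weights
    return (w_topic * float(features.has_topic_term)
            + w_non_topic * features.non_topic_count
            + w_pos * features.important_pos_ratio)


def select_top(tagged_sentences, topic_terms, important_pos, k=1):
    """Return the k highest-scoring sentences."""
    ranked = sorted(
        tagged_sentences,
        key=lambda s: score(extract_features(s, topic_terms, important_pos)),
        reverse=True,
    )
    return ranked[:k]
```

Because the abstract reports that the attributes affect subjects differently, the weights are left as a parameter here rather than fixed, so they could be tuned per course.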
English abstract Most students take notes in class, but in most cases they cannot write down everything that is covered. Students can recover the parts they missed by sharing their notes with one another. Drawing on the idea of the sharing economy, this study links students' notes so that each set of notes fills the gaps in the others.
Students and teachers are getting used to electronic teaching materials, and the key points of a course are often integrated into slides. Still, it is difficult to understand what a teacher wants to convey from notes or slides alone. The textbook, meanwhile, is so lengthy that students cannot easily get the point. Furthermore, what a teacher emphasizes is sometimes mentioned in the textbook only briefly, without going into depth, so students have to search for relevant information and then spend considerable time filtering out unhelpful results. This study therefore proposes a method called NoteSum that automatically integrates and summarizes learning materials to help students learn.
This study conducted several experiments and reached the following conclusions. First, Jensen-Shannon (JS) Divergence was used to assess the summaries, and those generated for the discussion course performed better. Next, three attributes (the presence of topic terms, the number of non-topic words, and the ratio of words with important parts of speech) had different effects on different subjects. Finally, NoteSum was compared with other summarization systems through automatic and manual evaluation; the results verified that NoteSum performed better and could help students learn more quickly.
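For reference, the Jensen-Shannon divergence between a summary and its source material can be computed from their word distributions. The sketch below is a minimal Python illustration that assumes unigram counts over text that has already been tokenized. The helper names are hypothetical, and this record does not describe the thesis's exact evaluation setup.

```python
import math
from collections import Counter


def word_distribution(tokens, vocabulary):
    """Relative frequency of each vocabulary word in tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[word] / total for word in vocabulary]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence to the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def summary_divergence(summary_tokens, source_tokens):
    """JS divergence between the unigram distributions of a summary and a source."""
    vocabulary = sorted(set(summary_tokens) | set(source_tokens))
    p = word_distribution(summary_tokens, vocabulary)
    q = word_distribution(source_tokens, vocabulary)
    return js_divergence(p, q)
```

With base-2 logarithms the value lies between 0 and 1. A lower value means the summary's word distribution is closer to that of the source, which is the sense in which one summary can be said to be closer to the course content than another.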
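Table of contents Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Scope, Limitations, and Assumptions
1.4 Research Process
1.5 Thesis Outline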
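Chapter 2 Literature Review
2.1 Natural Language Processing
2.1.1 Word Segmentation
2.1.2 n-gram
2.1.3 Stop Word Removal
2.2 Text Segmentation
2.2.1 Linear Text Segmentation
2.2.2 Hierarchical Text Segmentation
2.3 Note Summarization
2.3.1 Extractive Summarization
2.3.2 Abstractive Summarization
2.4 Summary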
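Chapter 3 Research Method
3.1 System Architecture
3.2 Topic Identification Module
3.2.1 Textbook Body Text Preprocessing
3.2.2 Textbook Table of Contents Preprocessing
3.2.3 Book Index Preprocessing
3.2.4 Slide Preprocessing
3.2.5 Note Preprocessing
3.2.6 Topic Extraction
3.3 Content Linking Module
3.3.1 Note Sharing
3.3.2 Textbook Text Segmentation
3.3.3 Linking Slides to Textbook Passages
3.3.4 Linking Learning Materials
3.4 Supplementary Data Collection Module
3.5 Learning Material Integration Module
3.5.1 Presence of Topic Terms
3.5.2 Number of Non-Topic Words
3.5.3 Ratio of Words with Important Parts of Speech
3.5.4 Sentence Scoring and Selection
3.6 Summary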
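Chapter 4 System Implementation and Validation
4.1 System Environment Setup
4.2 Experimental Method
4.2.1 Data Sources
4.2.2 Evaluation Metrics
4.3 Parameter Settings
4.4 Experimental Results and Analysis
4.4.1 Experiment 1
4.4.2 Experiment 2
4.4.3 Experiment 3
4.4.4 Experiment 4
4.4.5 Experiment 5
4.4.6 Experiment 6
4.5 Summary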
Chapter 5 Conclusions and Future Directions
5.1 Research Results
5.2 Future Research Directions
References
English References
Chinese References
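Full-text access rights
  • Authorized for on-campus browsing and printing of the electronic full text; publicly available from 2023-08-01.
  • Authorized for off-campus browsing and printing of the electronic full text; publicly available from 2023-08-01.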

