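System ID U0026-0607201616210000
Title (Chinese) 利用經驗貝氏方法估計錯誤發現率
Title (English) Estimation of False Discovery Rate Using Empirical Bayes Method
Institution National Cheng Kung University
Department (Chinese) 統計學系
Department (English) Department of Statistics
Academic year 104 (ROC calendar)
Semester 2
Year of publication 105 (ROC calendar; 2016)
Author (Chinese) 鄭暘諭
Author (English) Yang-Yu Cheng
Student ID r26034111
Degree Master's
Language Chinese
Number of pages 31
Committee Committee member: 陳俞成
Committee member: 林億雄
Advisor: 馬瀰嘉
Keywords (Chinese) EM演算法 (EM algorithm)  貝氏分析 (Bayesian analysis)  整體型I誤發生率 (familywise type I error rate)  錯誤發生率 (false discovery rate)
Keywords (English) EM Algorithm  Bayesian Approach  FWER  FDR
Subject classification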
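Abstract (Chinese) In multiple testing problems, if the significance level of each individual test is left unadjusted at α, the overall error rate across m tests can inflate to as much as mα. Previous literature shows that, when null hypotheses are false, methods that control the familywise error rate (FWER) suffer from low power and from individual type I error rates that fall below the significance level. When many hypotheses are tested simultaneously, the primary concern is how to control the type I error rate. Controlling the FWER is the best-known approach; an alternative is to control the false discovery rate (FDR). For either FWER or FDR, the loss of power that arises when null hypotheses are not true can be mitigated by obtaining a more accurate estimate of the number of true null hypotheses.
This study assumes that the data for each gene follow a normal mixture distribution and that the parameters have prior distributions. The Bayesian posterior distribution and the EM algorithm are used to estimate the proportion of true null hypotheses, and from it the number of true null hypotheses and the FDR.
When the number of genes is sufficient and the number of patients is large, the higher the proportion of true null hypotheses, the smaller the RMSE of the proposed EBay estimator and the more accurate the estimate. Monte Carlo simulation is used to examine performance under different parameter combinations. The estimator of Ma & Chao (2011), based on the McNemar test, yields large estimation error when the significance level is kept at α=0.05. The estimator proposed by Benjamini & Hochberg (2000) is unstable when the proportion of mutated genes is set to be random, and the estimator of Ma & Tsai (2011), based on the Friedman test, behaves similarly.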
Abstract (English) In multiple testing problems, if the individual significance level is left unadjusted at α, the overall type I error rate across m hypotheses can inflate to as much as mα.
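As a worked illustration of this inflation (an added sketch, not taken from the thesis), assume all m null hypotheses are true and the tests are independent. The familywise error rate then has a closed form, and mα is its Bonferroni upper bound:

```latex
\[
  \mathrm{FWER} \;=\; 1 - (1-\alpha)^{m} \;\le\; m\alpha .
\]
% Example with m = 10 and \alpha = 0.05:
\[
  1 - 0.95^{10} \approx 0.401, \qquad m\alpha = 0.5 .
\]
```

Under these assumptions the chance of at least one false rejection among ten unadjusted tests is already about 40%, against a nominal 5% for each test.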

This study assumes that the data for each gene follow a normal mixture distribution and that the parameters have prior distributions. We use the Bayesian posterior distribution and the EM algorithm to estimate the proportion of true null hypotheses, and from it the number of true null hypotheses and the FDR.

We compare the performance of these estimators under different parameter settings by Monte Carlo simulation. The estimator of Ma & Chao (2011), which uses the McNemar test, can have large estimation error when the significance level is set to α=0.05. The estimator proposed by Benjamini & Hochberg (2000) is unstable when the proportion of mutated genes is set to be random, and the estimator of Ma & Tsai (2011), which uses the Friedman test, behaves similarly. When the number of genes and the number of patients are both large, the proposed EBay estimator has a smaller RMSE, and is therefore more accurate, as the proportion of true null hypotheses increases.
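The general idea of estimating the proportion of true null hypotheses with the EM algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's EBay estimator: it fits a plain two-component normal mixture to simulated z-scores, fixes the null component at N(0, 1), and uses no prior distributions. The function names `em_null_proportion` and `plug_in_fdr`, the simulated data, and the plug-in formula π0·m·t / R(t) for the FDR at a p-value cutoff t are all assumptions introduced for this example.

```python
import math
import numpy as np


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def em_null_proportion(z, n_iter=500, tol=1e-8):
    """EM for z ~ pi0 * N(0, 1) + (1 - pi0) * N(mu1, sigma1^2).

    Returns (pi0, mu1, sigma1). The null component is held fixed (assumption).
    """
    z = np.asarray(z, dtype=float)
    pi0, mu1, sigma1 = 0.5, z.mean() + z.std(), max(z.std(), 1e-3)
    for _ in range(n_iter):
        # E-step: posterior probability that each statistic comes from the null.
        f0 = pi0 * normal_pdf(z, 0.0, 1.0)
        f1 = (1.0 - pi0) * normal_pdf(z, mu1, sigma1)
        w0 = f0 / (f0 + f1 + 1e-300)
        w1 = 1.0 - w0
        # M-step: update the mixing proportion and the alternative component.
        new_pi0 = w0.mean()
        total_w1 = max(w1.sum(), 1e-12)
        mu1 = np.sum(w1 * z) / total_w1
        sigma1 = max(math.sqrt(np.sum(w1 * (z - mu1) ** 2) / total_w1), 1e-3)
        converged = abs(new_pi0 - pi0) < tol
        pi0 = new_pi0
        if converged:
            break
    return pi0, mu1, sigma1


def plug_in_fdr(z, pi0, t=0.05):
    """Plug-in FDR estimate at p-value cutoff t: pi0 * m * t / (number rejected)."""
    z = np.asarray(z, dtype=float)
    # Two-sided p-value of a standard normal statistic: erfc(|z| / sqrt(2)).
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    n_reject = max(int(np.sum(p <= t)), 1)
    return min(1.0, pi0 * z.size * t / n_reject)


# Simulated data (hypothetical): 80% true nulls, alternatives centred at 3.
rng = np.random.default_rng(0)
m, true_pi0 = 2000, 0.8
is_null = rng.random(m) < true_pi0
z = np.where(is_null, rng.normal(0.0, 1.0, m), rng.normal(3.0, 1.0, m))

pi0_hat, mu1_hat, sigma1_hat = em_null_proportion(z)
fdr_hat = plug_in_fdr(z, pi0_hat)
print(f"estimated pi0 = {pi0_hat:.3f}, estimated number of true nulls = {pi0_hat * m:.0f}")
print(f"plug-in FDR estimate at t = 0.05: {fdr_hat:.3f}")
```

Repeating the simulation over many random seeds and averaging the squared error of `pi0_hat` against `true_pi0` would give a Monte Carlo RMSE of the kind the abstract uses to compare estimators.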
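Table of contents Chapter 1 Introduction
Chapter 2 Literature Review
Chapter 3 Methodology
Chapter 4 Statistical Simulation
Section 1 Case Study
Section 2 Simulation Procedure
Section 3 Simulation Results
Chapter 5 Conclusions and Recommendations
References
Appendix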
References I. English
1. Benjamini, Y., Hochberg, Y. (1995). “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing”, Journal of the Royal Statistical Society, Series B, 57, pp.289-300.
2. Benjamini, Y., Hochberg, Y. (2000). “On the Adaptive Control of the False Discovery Rate in Multiple Testing with Independent Statistics”, Journal of Educational and Behavioral Statistics, 25, pp.60-83.
3. Benjamini, Y., Liu, W. (1999). “A Step-down Multiple Hypotheses Testing Procedure that Controls the False Discovery Rate under Independence”, Journal of Statistical Planning and Inference, 82 (1-2), pp.163-170.
4. Diebolt, J., Robert, C. P. (1994). “Estimation of Finite Mixture Distributions through Bayesian Sampling”, Journal of the Royal Statistical Society, Series B, 56 (2), pp.363-375.
5. Efron, B. (2007). “Correlation and Large-scale Simultaneous Significance Testing”, Journal of the American Statistical Association, 102, pp.93-103.
6. Fraley, C., Raftery, A. E. (2007). “Bayesian Regularization for Normal Mixture Estimation and Model-based Clustering”, Journal of Classification, 24, pp.155-181.
7. Friguet, C., Kloareg, M., Causeur, D. (2009). “A Factor Model Approach to Multiple Testing under Dependence”, Journal of the American Statistical Association, 104, pp.1406-1415.
8. Gordon, A., Glazko, G., Qiu, X., Yakovlev, A. (2007). “Control of the Mean Number of False Discoveries, Bonferroni and Stability of Multiple Testing”, The Annals of Applied Statistics, 1 (1), pp.179-190.
9. Hsueh, H. M., Chen, J. J., Kodell, R. L. (2003). “Comparison of Methods for Estimating the Number of True Null Hypotheses in Multiplicity Testing”, Journal of Biopharmaceutical Statistics, 13, pp.675-689.
10. Liang, L. (2009). “On Simulation Methods for Two Component Normal Mixture Models under Bayesian Approach”, U.U.D.M. Project Report (2009), pp.17.
http://www.diva-portal.org/smash/get/diva2:300849/FULLTEXT01.pdf
11. Ma, M. C., Chao, W. C. (2011). “A Nonparametric Approach of Estimating the Number of True Null Hypotheses in Multiple Testing”, International Statistical Institute, August, Ireland, pp.4669-4674.
12. Ma, M. C., Tsai, C. Y. (2011). “A Nonparametric Approach to Estimate the Number of True Null Hypotheses in Multiple Testing under Dependency”, Master's thesis, Department of Statistics, NCKU.
13. Storey, J. D., Dai, J. Y., Leek, J. T. (2007). “The Optimal Discovery Procedure for Large-scale Significance Testing, with Application to Comparative Microarray Experiments”, Biostatistics, 8, pp.414-432.
14. Titterington, D. M. (1985). “Statistical Analysis of Finite Mixture Distributions”, 1st Ed., Wiley, New York.

II. Chinese
1. 林育興 (2010). 「以混合Beta模型估計多重比較檢定下虛無假設為真的比例」 [Estimating the Proportion of True Null Hypotheses in Multiple Comparison Tests Using a Beta Mixture Model], Master's thesis, Department of Statistics, National Taipei University.
2. 許乾柚 (2008). 「利用混合模型估計多重比較中真實虛無假設個數」 [Estimating the Number of True Null Hypotheses in Multiple Comparisons Using a Mixture Model], Master's thesis, Department of Statistics, National Taipei University.
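Full-text access rights
  • Authorized for on-campus browsing and printing of the electronic full text; publicly available from 2018-07-01.
  • Authorized for off-campus browsing and printing of the electronic full text; publicly available from 2018-07-01.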

