解析
ベルヌーイーベータ分布の事後分布
> p=seq(0,1,length=500) > a=10 > b=3 > s=21 > ns=26 > prior=dbeta(p,a,b) > like=dbeta(p,s,ns) > post=dbeta(p,a+s,b+ns) > Names<-c("prior","likelihood","posterior") > plot(p,like,ylim=range(post),type="l",main="prior,lilelihood,and posterior",xlab="x",ylab="Density",lty=2,lwd=1) > lines(p,prior,type="l",lty=1,lwd=1) > lines(p,post,type="l",lty=3,lwd=1) > legend(0,5,Names,col=1,lty=1:3) |
ナイーブベイズ(Naive Bayes)
ナイーブベイズとは、条件付き独立性を仮定した単純な分類法のことを指す。 Webページの自動カテゴリー分類やスパムフィルタなどに利用されている。 特にこのナイーブベイズの特徴として、語順や単語間の相関関係を考慮に入れていない。
1)訓練データ中の各ラベルの出現頻度を調べて、それぞれのラベルの事前確率を計算
2)事前確率に、それぞれの素性のラベル付けへの貢献度が組み合わされて、 それぞれのラベルの尤度が計算され、最も高い尤度と推定されたラベルが入力値に割り振られる。
ここでは、"kernlab"のパッケージにある サンプルデータspamを用いる。 このデータは、Spam e-mail database (4601通りのメール)には、 [1-57列目]にはスパムメールに特徴的な57種類の単語・記号など(素性) が記述されており、[58列目]にspamであるかnonspamであるか(ラベル) が記録されている。
> library(kernlab) > data(spam) #spamデータを学習データと評価データに2分割 > tr.index<-sample(nrow(spam),as.integer(nrow(spam)*0.5)) > spam.train<-spam[tr.index,] > spam.test<-spam[-tr.index,] > > library(e1071) > model<-naiveBayes(type~.,data=spam.train) > model Naive Bayes Classifier for Discrete Predictors Call: naiveBayes.default(x = X, y = Y, laplace = laplace) A-priori probabilities: Y nonspam spam 0.5982609 0.4017391 Conditional probabilities: make Y [,1] [,2] nonspam 0.07376453 0.3107451 spam 0.14967532 0.3214824 address Y [,1] [,2] nonspam 0.2922602 1.816776 spam 0.1625866 0.329196 all Y [,1] [,2] nonspam 0.1853343 0.4714455 spam 0.3988853 0.4659669 num3d Y [,1] [,2] nonspam 0.0007630814 0.01237295 spam 0.1405735931 2.08201034 our Y [,1] [,2] nonspam 0.1898401 0.6610166 spam 0.5208333 0.6931392 over Y [,1] [,2] nonspam 0.04448401 0.2446386 spam 0.17523810 0.3174882 remove Y [,1] [,2] nonspam 0.0109375 0.1048625 spam 0.2909740 0.6281295 internet Y [,1] [,2] nonspam 0.03468023 0.2146108 spam 0.21385281 0.5703914 order Y [,1] [,2] nonspam 0.04676599 0.2428200 spam 0.17455628 0.3498043 mail Y [,1] [,2] nonspam 0.1781904 0.7692418 spam 0.3242532 0.5337798 receive Y [,1] [,2] nonspam 0.01967297 0.1340848 spam 0.11744589 0.2578563 will Y [,1] [,2] nonspam 0.4931541 0.9332249 spam 0.5451190 0.6700317 people Y [,1] [,2] nonspam 0.06849564 0.2890134 spam 0.14792208 0.3805220 report Y [,1] [,2] nonspam 0.02162791 0.1705332 spam 0.08379870 0.3002757 addresses Y [,1] [,2] nonspam 0.008328488 0.09322721 spam 0.116536797 0.38772207 free Y [,1] [,2] nonspam 0.09153343 0.8284615 spam 0.48649351 0.8956252 business Y [,1] [,2] nonspam 0.04949855 0.2267984 spam 0.27961039 0.6387855 email Y [,1] [,2] nonspam 0.1113663 0.4555041 spam 0.3364502 0.6936851 you Y [,1] [,2] nonspam 1.295538 1.834515 spam 2.199524 1.576138 credit Y [,1] [,2] nonspam 0.005588663 0.07934057 spam 0.182132035 0.62194110 your Y [,1] [,2] nonspam 0.4356759 0.9792147 spam 1.3628571 1.2223474 font Y [,1] [,2] nonspam 0.03786337 0.5322798 spam 0.24167749 1.5125435 num000 Y [,1] [,2] nonspam 0.006235465 0.05578726 spam 0.259556277 0.52747597 money Y [,1] [,2] nonspam 0.01603924 0.2630207 spam 0.19656926 0.5130490 hp Y [,1] [,2] nonspam 0.92818314 2.0957413 spam 0.01589827 0.1601007 hpl Y [,1] [,2] nonspam 0.45960756 1.20543961 spam 0.00784632 0.09869909 george Y [,1] [,2] nonspam 1.256853198 4.17507571 spam 0.002435065 0.04554974 num650 Y [,1] [,2] nonspam 0.21801599 0.6785334 spam 0.02849567 0.4153449 lab Y [,1] [,2] nonspam 0.1716787791 0.73044217 spam 0.0009090909 0.01663568 labs Y [,1] [,2] nonspam 0.182507267 0.6037087 spam 0.006926407 0.0911006 telnet Y [,1] [,2] nonspam 0.117303779 0.57757505 spam 0.001904762 0.04662206 num857 Y [,1] [,2] nonspam 0.084084302 0.43261627 spam 0.001017316 0.02185453 data Y [,1] [,2] nonspam 0.15429506 0.7498340 spam 0.01642857 0.1214887 num415 Y [,1] [,2] nonspam 0.084752907 0.43333977 spam 0.002077922 0.04670289 num85 Y [,1] [,2] nonspam 0.191162791 0.79480016 spam 0.005790043 0.04578109 technology Y [,1] [,2] nonspam 0.14553779 0.4868970 spam 0.02979437 0.1487502 num1999 Y [,1] [,2] nonspam 0.20130814 0.5096179 spam 0.05012987 0.3137661 parts Y [,1] [,2] nonspam 0.016984012 0.25985056 spam 0.005963203 0.06271436 pm Y [,1] [,2] nonspam 0.10350291 0.46125577 spam 0.01233766 0.09183663 direct Y [,1] [,2] nonspam 0.09345930 0.4597816 spam 0.04044372 0.1634167 cs Y [,1] [,2] nonspam 0.0845712209 0.516261348 spam 0.0001082251 0.003289758 meeting Y [,1] [,2] nonspam 0.232245640 1.05697137 spam 0.001818182 0.02240534 original Y [,1] [,2] nonspam 0.072747093 0.28407120 spam 0.008084416 0.04760242 project Y [,1] [,2] nonspam 0.118393895 0.68212564 spam 0.006991342 0.06366247 re Y [,1] [,2] nonspam 0.4144404 1.2385933 spam 0.1314719 0.3566903 edu Y [,1] [,2] nonspam 0.29623547 1.1975767 spam 0.01768398 0.1546024 table Y [,1] [,2] nonspam 0.007841570 0.07898948 spam 0.001114719 0.01904980 conference Y [,1] [,2] nonspam 0.047994186 0.36349442 spam 0.002186147 0.03058448 charSemicolon Y [,1] [,2] nonspam 0.04367515 0.2723684 spam 0.01937662 0.0887151 charRoundbracket Y [,1] [,2] nonspam 0.1589193 0.2468231 spam 0.1191504 0.3739926 charSquarebracket Y [,1] [,2] nonspam 0.024977471 0.15585250 spam 0.009510823 0.05389867 charExclamation Y [,1] [,2] nonspam 0.1006890 0.6691533 spam 0.5227511 0.7599010 charDollar Y [,1] [,2] nonspam 0.01092878 0.05918727 spam 0.19095022 0.40034431 charHash Y [,1] [,2] nonspam 0.02520276 0.2862500 spam 0.07279221 0.4870051 capitalAve Y [,1] [,2] nonspam 2.294866 1.904008 spam 9.177057 47.553399 capitalLong Y [,1] [,2] nonspam 17.17006 24.8506 spam 112.11905 376.5911 capitalTotal Y [,1] [,2] nonspam 153.5291 344.8686 spam 501.9708 954.0788 > pr.model<-predict(model,spam.test[,-58],type="class") > pr.model [1] spam spam spam spam spam spam spam spam [9] spam spam spam spam spam spam spam spam [17] spam spam spam spam spam spam spam spam [25] spam spam spam spam spam spam spam spam [33] spam spam spam spam spam spam spam spam [41] spam spam spam spam spam spam spam spam [49] spam spam spam spam spam spam spam spam [57] spam nonspam spam spam spam spam spam spam [65] spam spam spam spam spam spam spam spam [73] spam spam spam spam spam spam spam spam [81] spam spam spam spam spam spam spam nonspam [89] spam spam spam spam spam spam spam spam [97] spam spam spam spam spam spam spam spam [105] spam spam spam spam spam spam spam spam [113] spam spam spam spam spam spam spam spam [121] nonspam spam spam spam spam spam spam spam [129] spam spam spam spam spam spam spam spam [137] spam spam spam spam spam spam spam spam [145] spam spam spam spam spam spam spam spam [153] spam spam spam spam spam spam spam spam [161] spam spam spam spam spam spam spam spam [169] spam spam spam spam spam spam spam spam [177] spam spam spam spam spam spam spam spam [185] spam spam spam spam spam spam spam spam [193] spam spam spam spam spam spam spam spam [201] spam spam spam spam spam spam spam spam [209] spam spam spam spam spam spam spam spam [217] spam spam nonspam spam spam spam spam spam [225] spam spam spam spam spam spam spam spam [233] spam spam spam spam spam spam spam spam [241] spam spam spam spam spam spam spam spam [249] spam spam spam spam spam spam spam spam [257] spam spam spam spam spam nonspam nonspam spam [265] spam spam spam spam spam spam spam spam [273] spam spam spam spam spam spam nonspam spam [281] spam spam spam spam spam spam spam spam [289] spam spam spam spam spam spam spam spam [297] spam nonspam spam spam spam spam spam spam [305] spam spam spam spam spam spam spam spam [313] spam spam spam spam spam spam spam spam [321] spam spam spam spam spam spam spam spam [329] spam spam spam spam spam spam spam spam [337] spam spam spam spam spam spam spam spam [345] spam spam spam spam spam spam spam spam [353] spam spam spam spam spam spam nonspam spam [361] spam spam spam nonspam spam spam spam spam [369] spam spam spam spam spam spam spam spam [377] spam spam spam nonspam spam spam spam spam [385] spam spam spam spam nonspam spam spam spam [393] spam spam spam spam spam spam spam spam [401] spam spam spam spam spam spam spam spam [409] spam spam spam spam spam spam spam spam [417] spam spam spam spam spam spam spam spam [425] spam spam spam spam spam spam spam spam [433] spam spam spam spam spam spam spam spam [441] spam spam spam spam spam spam spam spam [449] spam spam spam spam spam spam spam spam [457] spam spam spam spam spam spam spam spam [465] spam spam spam spam spam spam spam spam [473] spam spam spam spam spam spam spam spam [481] spam spam spam spam spam spam spam spam [489] spam spam spam spam spam spam spam spam [497] spam nonspam spam spam spam spam spam spam [505] spam spam spam nonspam spam spam spam spam [513] nonspam spam spam spam spam spam spam spam [521] spam spam spam spam spam spam spam spam [529] spam nonspam spam spam spam spam spam spam [537] spam spam spam spam nonspam spam spam spam [545] spam spam spam spam spam spam nonspam spam [553] spam nonspam spam spam spam spam spam spam [561] spam spam spam spam spam spam spam nonspam [569] spam spam spam spam spam spam spam spam [577] spam spam spam spam spam spam spam spam [585] spam spam spam spam spam spam spam spam [593] spam spam spam spam spam spam spam spam [601] nonspam spam spam spam spam spam spam spam [609] spam spam spam spam spam spam spam spam [617] spam spam spam spam spam spam spam spam [625] spam spam nonspam spam spam spam nonspam spam [633] spam spam spam spam spam spam spam spam [641] spam spam spam spam spam spam spam spam [649] spam spam spam spam spam spam spam spam [657] spam spam spam spam spam spam spam spam [665] spam spam spam spam spam spam spam spam [673] spam spam nonspam spam spam spam spam spam [681] spam spam spam spam spam nonspam spam spam [689] spam spam spam spam spam spam spam spam [697] spam spam spam spam spam spam spam spam [705] spam spam spam spam spam nonspam spam spam [713] spam spam spam spam spam spam spam spam [721] spam spam spam spam spam spam spam spam [729] nonspam spam spam spam spam spam spam spam [737] spam spam spam spam spam spam spam spam [745] nonspam spam spam spam spam spam spam spam [753] spam spam spam spam nonspam spam spam spam [761] nonspam nonspam spam spam spam spam spam spam [769] spam spam spam spam spam spam spam spam [777] spam spam spam spam spam spam spam spam [785] spam spam spam nonspam spam spam spam spam [793] spam spam spam spam spam spam nonspam spam [801] spam spam nonspam spam spam spam spam spam [809] spam spam spam spam spam spam spam spam [817] nonspam spam spam spam spam nonspam spam spam [825] spam spam spam spam spam spam spam spam [833] spam spam spam spam spam nonspam spam nonspam [841] spam spam spam nonspam spam spam spam spam [849] spam spam spam spam spam spam spam spam [857] spam spam spam spam nonspam spam spam spam [865] spam nonspam spam spam spam spam spam spam [873] spam spam spam spam spam spam spam spam [881] spam spam spam spam nonspam spam spam spam [889] spam spam nonspam spam spam spam spam nonspam [897] nonspam nonspam nonspam nonspam nonspam nonspam spam spam [905] nonspam nonspam spam nonspam spam nonspam spam nonspam [913] spam nonspam nonspam spam spam nonspam nonspam spam [921] nonspam spam nonspam nonspam spam nonspam nonspam nonspam [929] nonspam nonspam spam spam spam nonspam spam spam [937] nonspam nonspam nonspam nonspam nonspam spam spam nonspam [945] nonspam spam spam nonspam nonspam spam nonspam nonspam [953] nonspam spam spam nonspam nonspam spam nonspam nonspam [961] spam nonspam spam nonspam nonspam spam spam spam [969] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam [977] nonspam spam spam nonspam nonspam spam spam nonspam [985] nonspam nonspam spam nonspam nonspam spam nonspam spam [993] nonspam nonspam nonspam nonspam nonspam nonspam nonspam spam [1001] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam [1009] nonspam nonspam nonspam spam spam nonspam nonspam nonspam [1017] nonspam spam spam nonspam nonspam nonspam nonspam nonspam [1025] nonspam spam nonspam spam nonspam nonspam nonspam nonspam [1033] spam spam spam spam spam nonspam spam nonspam [1041] spam spam spam spam nonspam spam spam nonspam [1049] spam spam nonspam nonspam nonspam spam spam nonspam [1057] spam nonspam nonspam nonspam spam nonspam spam spam [1065] spam spam spam nonspam nonspam nonspam nonspam spam [1073] nonspam nonspam spam nonspam nonspam nonspam spam nonspam [1081] nonspam spam spam nonspam nonspam spam nonspam nonspam [1089] spam nonspam spam nonspam nonspam nonspam nonspam spam [1097] nonspam nonspam nonspam nonspam spam nonspam nonspam spam [1105] spam spam nonspam spam nonspam spam nonspam spam [1113] nonspam nonspam spam spam spam nonspam spam nonspam [1121] spam nonspam spam spam nonspam spam nonspam nonspam [1129] nonspam nonspam nonspam nonspam nonspam spam nonspam spam [1137] nonspam spam nonspam nonspam nonspam nonspam nonspam spam [1145] nonspam nonspam spam spam nonspam nonspam nonspam spam [1153] nonspam spam nonspam spam spam spam spam nonspam [1161] spam nonspam nonspam nonspam spam nonspam spam nonspam [1169] nonspam nonspam spam nonspam nonspam spam nonspam spam [1177] spam nonspam spam spam nonspam spam nonspam nonspam [1185] nonspam nonspam spam nonspam nonspam nonspam nonspam spam [1193] nonspam nonspam spam nonspam nonspam nonspam nonspam nonspam [1201] nonspam nonspam nonspam nonspam nonspam spam spam nonspam [1209] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam [1217] nonspam nonspam nonspam nonspam nonspam nonspam nonspam spam [1225] nonspam nonspam nonspam nonspam spam nonspam nonspam spam [1233] nonspam spam spam spam spam spam spam spam [1241] nonspam spam nonspam nonspam nonspam spam nonspam nonspam [1249] nonspam nonspam nonspam nonspam spam nonspam nonspam nonspam [1257] spam spam spam nonspam nonspam spam nonspam nonspam [1265] nonspam nonspam nonspam nonspam nonspam nonspam spam spam [1273] nonspam nonspam spam nonspam nonspam spam spam nonspam [1281] spam spam nonspam nonspam spam nonspam nonspam nonspam [1289] spam nonspam nonspam spam spam nonspam nonspam nonspam [1297] nonspam nonspam nonspam nonspam spam nonspam spam spam [1305] nonspam spam nonspam nonspam nonspam spam nonspam nonspam [1313] nonspam spam nonspam nonspam nonspam spam nonspam nonspam [1321] spam nonspam nonspam nonspam nonspam nonspam spam spam [1329] nonspam spam nonspam nonspam nonspam spam nonspam spam [1337] nonspam nonspam nonspam nonspam nonspam nonspam spam spam [1345] nonspam nonspam nonspam nonspam nonspam nonspam spam nonspam [1353] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam [1361] nonspam nonspam nonspam nonspam spam spam nonspam nonspam [1369] nonspam nonspam nonspam nonspam spam spam spam spam [1377] nonspam nonspam nonspam spam spam spam nonspam spam [1385] spam nonspam nonspam spam spam spam spam spam [1393] spam spam spam nonspam nonspam nonspam spam spam [1401] spam nonspam nonspam nonspam nonspam nonspam nonspam spam [1409] nonspam nonspam nonspam spam nonspam spam spam nonspam [1417] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam [1425] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam [1433] spam nonspam spam nonspam nonspam spam spam spam [1441] nonspam nonspam nonspam spam nonspam spam nonspam spam [1449] spam nonspam nonspam nonspam nonspam nonspam spam spam [1457] nonspam spam nonspam nonspam nonspam nonspam spam spam [1465] nonspam nonspam nonspam spam spam nonspam nonspam nonspam [1473] nonspam spam nonspam spam spam nonspam nonspam nonspam [1481] nonspam spam nonspam spam nonspam spam nonspam nonspam [1489] nonspam nonspam nonspam nonspam nonspam spam nonspam spam [1497] spam nonspam spam spam spam nonspam spam nonspam [1505] nonspam spam spam spam spam nonspam spam nonspam [1513] nonspam nonspam nonspam spam nonspam nonspam nonspam nonspam [1521] nonspam nonspam nonspam spam nonspam spam nonspam nonspam [1529] nonspam spam nonspam nonspam nonspam spam nonspam nonspam [1537] spam nonspam nonspam spam spam spam spam spam [1545] spam spam spam spam spam spam spam spam [1553] spam spam spam nonspam nonspam nonspam nonspam nonspam [1561] nonspam spam spam spam nonspam spam spam nonspam [1569] spam spam spam nonspam spam nonspam nonspam nonspam [1577] spam nonspam spam nonspam spam nonspam spam nonspam [1585] nonspam nonspam nonspam nonspam spam nonspam spam nonspam [1593] spam spam spam spam spam spam nonspam spam [1601] spam nonspam nonspam nonspam nonspam spam nonspam nonspam [1609] nonspam nonspam nonspam nonspam spam nonspam nonspam nonspam [1617] spam nonspam spam nonspam nonspam spam nonspam nonspam [1625] spam spam nonspam nonspam spam spam spam spam [1633] nonspam nonspam spam nonspam spam nonspam nonspam nonspam [1641] nonspam spam nonspam spam nonspam nonspam spam spam [1649] spam nonspam nonspam nonspam nonspam nonspam spam spam [1657] spam nonspam nonspam nonspam spam nonspam nonspam nonspam [1665] nonspam spam spam spam spam spam nonspam nonspam [1673] spam nonspam nonspam spam spam nonspam nonspam nonspam [1681] nonspam spam nonspam spam nonspam nonspam spam spam [1689] nonspam nonspam spam spam nonspam nonspam spam nonspam [1697] spam nonspam nonspam spam nonspam nonspam spam spam [1705] nonspam nonspam spam nonspam spam spam nonspam spam [1713] nonspam spam nonspam nonspam nonspam spam nonspam nonspam [1721] spam nonspam nonspam spam spam nonspam nonspam spam [1729] spam nonspam nonspam nonspam nonspam nonspam spam nonspam [1737] nonspam spam nonspam spam nonspam nonspam nonspam nonspam [1745] nonspam nonspam spam nonspam nonspam nonspam nonspam spam [1753] nonspam spam spam nonspam nonspam nonspam nonspam nonspam [1761] nonspam nonspam nonspam nonspam nonspam spam spam nonspam [1769] spam spam spam spam spam spam spam spam [1777] spam spam spam spam spam spam spam spam [1785] spam spam spam spam spam spam spam spam [1793] spam spam spam spam spam spam spam spam [1801] spam spam spam spam spam spam spam spam [1809] spam spam spam spam spam spam spam spam [1817] spam spam spam spam spam spam spam spam [1825] spam spam spam spam spam spam spam nonspam [1833] nonspam spam nonspam nonspam spam nonspam nonspam spam [1841] spam nonspam nonspam nonspam spam nonspam nonspam nonspam [1849] nonspam nonspam spam spam nonspam nonspam nonspam nonspam [1857] spam nonspam nonspam nonspam nonspam spam nonspam nonspam [1865] nonspam nonspam nonspam spam spam spam nonspam nonspam [1873] spam spam spam spam spam nonspam spam spam [1881] nonspam spam nonspam spam spam nonspam nonspam nonspam [1889] nonspam nonspam nonspam nonspam nonspam spam nonspam spam [1897] spam spam spam spam nonspam spam nonspam nonspam [1905] nonspam nonspam nonspam nonspam spam nonspam nonspam nonspam [1913] spam spam nonspam nonspam spam nonspam nonspam spam [1921] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam [1929] nonspam nonspam nonspam nonspam spam nonspam nonspam nonspam [1937] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam [1945] spam nonspam nonspam spam nonspam nonspam spam nonspam [1953] spam nonspam spam spam spam spam nonspam nonspam [1961] nonspam nonspam nonspam spam nonspam nonspam nonspam nonspam [1969] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam [1977] spam nonspam spam nonspam spam nonspam nonspam nonspam [1985] spam nonspam nonspam spam nonspam nonspam nonspam spam [1993] spam nonspam nonspam nonspam nonspam spam spam nonspam [2001] spam nonspam nonspam spam spam spam nonspam nonspam [2009] spam nonspam spam nonspam nonspam nonspam nonspam spam [2017] spam nonspam spam spam nonspam spam spam nonspam [2025] nonspam spam nonspam nonspam spam nonspam spam nonspam [2033] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam [2041] nonspam nonspam nonspam spam spam spam nonspam spam [2049] spam spam spam nonspam spam nonspam nonspam nonspam [2057] nonspam nonspam nonspam nonspam spam spam spam nonspam [2065] nonspam spam spam nonspam spam nonspam spam spam [2073] spam spam spam spam spam spam spam spam [2081] spam spam spam nonspam nonspam nonspam spam spam [2089] spam spam nonspam spam spam spam spam nonspam [2097] spam spam spam spam spam nonspam spam spam [2105] spam spam spam nonspam spam spam spam nonspam [2113] spam spam spam spam spam spam spam nonspam [2121] spam nonspam spam spam nonspam spam spam nonspam [2129] spam spam nonspam nonspam nonspam nonspam nonspam nonspam [2137] spam spam nonspam spam nonspam spam nonspam spam [2145] spam spam spam spam spam nonspam spam nonspam [2153] spam spam spam spam spam spam nonspam nonspam [2161] spam nonspam spam nonspam nonspam spam nonspam spam [2169] nonspam nonspam spam spam spam nonspam spam nonspam [2177] spam spam nonspam spam spam spam spam spam [2185] spam spam nonspam spam spam spam spam spam [2193] spam spam spam nonspam nonspam spam spam spam [2201] spam spam nonspam nonspam spam spam nonspam spam [2209] nonspam nonspam nonspam nonspam spam spam spam spam [2217] spam spam spam spam spam spam spam spam [2225] spam spam nonspam spam spam spam nonspam spam [2233] nonspam nonspam spam spam nonspam nonspam nonspam spam [2241] spam nonspam spam spam spam spam spam spam [2249] spam nonspam nonspam spam nonspam spam spam nonspam [2257] spam nonspam nonspam nonspam spam nonspam spam nonspam [2265] nonspam nonspam spam spam nonspam nonspam spam nonspam [2273] spam nonspam spam spam spam nonspam nonspam spam [2281] nonspam spam nonspam spam spam nonspam spam spam [2289] nonspam spam nonspam nonspam nonspam nonspam nonspam spam [2297] spam spam nonspam spam spam Levels: nonspam spam > table(spam.test[,58],pr.model) pr.model nonspam spam nonspam 778 634 spam 42 847 > result<-table(spam.test$type,pr.model) > 2*result[2,2]/(2*result[2,2]+result[1,2]+result[2,1]) [1] 0.7147679 |
マルコフ連鎖モンテカルロ法(MCMC)
ギッブスサンプリング
ある正規母集団から15個の観測値yを得た場合のパラメータ(μ, σ2)の事後分布をギッブスサンプリングを用いて評価する。 この際の事前分布はμ〜N(μ0、σ20)、σ2〜IG(r0/2,s0/2)を仮定する。
> y<-c(1.53,19.02,5.34,-2.16,0.83,6.74,5.53,0.65,-0.49,-0.08,5.31,3.18,7.19,4.23,6.45) > my<-mean(y) > n<-length(y) > iterations<-3500 > m0<-0 > s0<-100 > a0<-0.01 > b0<-0.01 > theta<-matrix(nrow=iterations,ncol=2) > cur.mu<-0 > cur.tau<-1 > cur.s<-sqrt(1/cur.tau) > for(t in 1:iterations){ + w<-s0^2/(cur.s^2/n+s0^2) + m<-w*my+(1-w)*m0 + s<-sqrt(w+cur.s^2/2) + cur.mu<-rnorm(1,m,s) + a<-a0+0.5*n + b<-b0+0.5*sum((y-cur.mu)^2) + cur.tau<-rgamma(1,a,b) + cur.s<-sqrt(1/cur.tau) + theta[t,]<-c(cur.mu,cur.s)} > op<-par(mfrow=c(2,1)) > p=1:iterations > plot(p,theta[,1],type="l",main="mu",xlab="iteration",ylab="mu",lty=1,lwd=1) > plot(p,theta[,2],type="l",main="sigma",xlab="iteration",ylab="sigma",lty=1,lwd=1) > par(op) |
メトロポリス-ヘイスティングス
線形回帰モデル
正規線形回帰モデル:ギッブスサンプリング
> library(MCMCpack) > regdata<-list(X=c(-2,-1,0,1,2,3),Y=c(1,3,3,3,5,6)) > posterior<-MCMCregress(Y~X,data=regdata,burnin=1000,mcmc=1000,b0=0.0,B0=0,c0=2,d0=0.001,verbose=1000) MCMCregress iteration 1 of 2000 beta = 3.15047 0.87636 sigma2 = 0.18268 MCMCregress iteration 1001 of 2000 beta = 2.80864 0.85773 sigma2 = 0.12893 > plot(posterior) > summary(posterior) Iterations = 1001:2000 Thinning interval = 1 Number of chains = 1 Sample size per chain = 1000 1. Empirical mean and standard deviation for each variable, plus standard error of the mean: Mean SD Naive SE Time-series SE (Intercept) 3.0691 0.2536 0.008019 0.007150 X 0.8854 0.1483 0.004691 0.003937 sigma2 0.4067 0.3146 0.009950 0.012335 2. Quantiles for each variable: 2.5% 25% 50% 75% 97.5% (Intercept) 2.5580 2.9199 3.0691 3.2250 3.587 X 0.5911 0.7966 0.8849 0.9771 1.184 sigma2 0.1197 0.2167 0.3189 0.4743 1.298 |
トービットモデル:ギッブスサンプリング
> library(survival) > example(tobin) tobin> tfit <- survreg(Surv(durable, durable>0, type='left') ~age + quant, tobin+ data=tobin, dist='gaussian') tobin> predict(tfit,type="response") [1] -3.04968679 -4.31254182 -0.54163315 -0.25607164 -1.85017727 -2.40987803 [7] -3.50629220 -0.74041486 -4.05145594 -3.55880518 -0.32223237 -3.68044619 [13] -3.65997456 -2.63254564 0.22382063 0.02177674 -0.09571284 -3.17696755 [19] -0.61521215 -3.13913903 > summary(tfit) Call: survreg(formula = Surv(durable, durable > 0, type = "left") ~ age + quant, data = tobin, dist = "gaussian") Value Std. Error z p (Intercept) 15.1449 16.0795 0.942 3.46e-01 age -0.1291 0.2186 -0.590 5.55e-01 quant -0.0455 0.0583 -0.782 4.34e-01 Log(scale) 1.7179 0.3103 5.536 3.10e-08 Scale= 5.57 Gaussian distribution Loglik(model)= -28.9 Loglik(intercept only)= -29.5 Chisq= 1.1 on 2 degrees of freedom, p= 0.58 Number of Newton-Raphson Iterations: 3 n= 20 > tfit.mcmc<-MCMCtobit(durable~age+quant,data=tobin,mcmc=20000,b0=0.0,B0=0,c0=0.001,d0=0.001,verbose=1000) MCMCtobit iteration 1 of 21000 beta = 14.53324 -0.17875 -0.02518 sigma2 = 8.41402 MCMCtobit iteration 1001 of 21000 beta = 52.55459 -0.55878 -0.15851 sigma2 = 454.21928 MCMCtobit iteration 2001 of 21000 beta = 19.96620 -0.17549 -0.07358 sigma2 = 73.64337 MCMCtobit iteration 3001 of 21000 beta = 215.85382 -4.63364 -0.39967 sigma2 = 7795.87892 MCMCtobit iteration 4001 of 21000 beta = 36.92052 -1.40173 0.05722 sigma2 = 818.75771 MCMCtobit iteration 5001 of 21000 beta = -20.32282 0.56543 -0.05620 sigma2 = 329.88211 MCMCtobit iteration 6001 of 21000 beta = -19.99355 0.29027 -0.02940 sigma2 = 158.42768 MCMCtobit iteration 7001 of 21000 beta = 54.44763 -0.19122 -0.19175 sigma2 = 84.79789 MCMCtobit iteration 8001 of 21000 beta = 21.27023 -1.47644 0.15638 sigma2 = 209.03197 MCMCtobit iteration 9001 of 21000 beta = 19.55942 0.00574 -0.11454 sigma2 = 178.02732 MCMCtobit iteration 10001 of 21000 beta = -23.13812 -0.37182 0.11873 sigma2 = 273.72350 MCMCtobit iteration 11001 of 21000 beta = -36.96670 -0.38160 0.19984 sigma2 = 230.54203 MCMCtobit iteration 12001 of 21000 beta = 27.92896 -0.05697 -0.12680 sigma2 = 54.93730 MCMCtobit iteration 13001 of 21000 beta = 38.62670 -0.31182 -0.11833 sigma2 = 84.74766 MCMCtobit iteration 14001 of 21000 beta = 20.04418 -0.19912 -0.05552 sigma2 = 32.96554 MCMCtobit iteration 15001 of 21000 beta = 4.07612 0.30036 -0.10512 sigma2 = 199.13058 MCMCtobit iteration 16001 of 21000 beta = 9.04425 0.12401 -0.08732 sigma2 = 49.02667 MCMCtobit iteration 17001 of 21000 beta = 23.73888 -0.07277 -0.09530 sigma2 = 61.10130 MCMCtobit iteration 18001 of 21000 beta = 11.13250 -0.18867 -0.02441 sigma2 = 21.97008 MCMCtobit iteration 19001 of 21000 beta = 50.45571 -0.33907 -0.21748 sigma2 = 282.25100 MCMCtobit iteration 20001 of 21000 beta = 43.70535 -0.17534 -0.18929 sigma2 = 91.96757 > plot(tfit.mcmc) > summary(tfit.mcmc) Iterations = 1001:21000 Thinning interval = 1 Number of chains = 1 Sample size per chain = 20000 1. Empirical mean and standard deviation for each variable, plus standard error of the mean: Mean SD Naive SE Time-series SE (Intercept) 18.53980 39.2003 0.277188 0.499879 age -0.29145 0.5818 0.004114 0.009942 quant -0.04648 0.1422 0.001006 0.001427 sigma2 173.09722 361.0501 2.553010 12.107377 2. Quantiles for each variable: 2.5% 25% 50% 75% 97.5% (Intercept) -55.1223 -0.5865 17.6714 36.12702 98.4133 age -1.6044 -0.5153 -0.2242 0.01572 0.6172 quant -0.3281 -0.1124 -0.0460 0.01921 0.2295 sigma2 19.6001 48.3753 86.6096 173.74384 823.2478 |
2項プロビットモデル:ギッブスサンプリング
> data(birthwt) > posterior<-MCMCprobit(low~smoke,data=birthwt) > plot(posterior) > summary(posterior) Iterations = 1001:11000 Thinning interval = 1 Number of chains = 1 Sample size per chain = 10000 1. Empirical mean and standard deviation for each variable, plus standard error of the mean: Mean SD Naive SE Time-series SE (Intercept) -0.6704 0.1257 0.001257 0.002232 smoke 0.4355 0.1941 0.001941 0.002763 2. Quantiles for each variable: 2.5% 25% 50% 75% 97.5% (Intercept) -0.9186 -0.7555 -0.6710 -0.5856 -0.4275 smoke 0.0575 0.3056 0.4341 0.5665 0.8181 |
2項ロジットモデル:M-Hサンプリング
> data(birthwt) > posterior<-MCMClogit(low~age+smoke,b0=0,B0=0.001,burnin=1000,mcmc=10000,data=birthwt) @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ The Metropolis acceptance rate for beta was 0.41109 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ > plot(posterior) > summary(posterior) Iterations = 1001:11000 Thinning interval = 1 Number of chains = 1 Sample size per chain = 10000 1. Empirical mean and standard deviation for each variable, plus standard error of the mean: Mean SD Naive SE Time-series SE (Intercept) 0.09399 0.78373 0.0078373 0.024541 age -0.05192 0.03295 0.0003295 0.001068 smoke 0.69249 0.32523 0.0032523 0.010616 2. Quantiles for each variable: 2.5% 25% 50% 75% 97.5% (Intercept) -1.47912 -0.42880 0.08399 0.62812 1.5886 age -0.11908 -0.07403 -0.05057 -0.03039 0.0123 smoke 0.05021 0.47021 0.68683 0.91203 1.3518 |
WinBUGSの導入
http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml
32Bit版のOSを使用している場合、Quick startからWinBugs14.exeをダウンロードし、インストール。
64Bit版の場合は、1.Zipファイルごとダウンロードし、展開したものをProgram Filesへ移動。
2.WinBugを起動し、[File]→[Open]を選択し、パッチファイルを開く。 [Tools]→[Decode]を選択し、[Decode ALL]を選択。
Back to R