解析
ベルヌーイーベータ分布の事後分布
> p=seq(0,1,length=500)
> a=10
> b=3
> s=21
> ns=26
> prior=dbeta(p,a,b)
> like=dbeta(p,s,ns)
> post=dbeta(p,a+s,b+ns)
> Names<-c("prior","likelihood","posterior")
> plot(p,like,ylim=range(post),type="l",main="prior,lilelihood,and posterior",xlab="x",ylab="Density",lty=2,lwd=1)
> lines(p,prior,type="l",lty=1,lwd=1)
> lines(p,post,type="l",lty=3,lwd=1)
> legend(0,5,Names,col=1,lty=1:3)
|
ナイーブベイズ(Naive Bayes)
ナイーブベイズとは、条件付き独立性を仮定した単純な分類法のことを指す。 Webページの自動カテゴリー分類やスパムフィルタなどに利用されている。 特にこのナイーブベイズの特徴として、語順や単語間の相関関係を考慮に入れていない。
1)訓練データ中の各ラベルの出現頻度を調べて、それぞれのラベルの事前確率を計算
2)事前確率に、それぞれの素性のラベル付けへの貢献度が組み合わされて、 それぞれのラベルの尤度が計算され、最も高い尤度と推定されたラベルが入力値に割り振られる。
ここでは、"kernlab"のパッケージにある サンプルデータspamを用いる。 このデータは、Spam e-mail database (4601通りのメール)には、 [1-57列目]にはスパムメールに特徴的な57種類の単語・記号など(素性) が記述されており、[58列目]にspamであるかnonspamであるか(ラベル) が記録されている。
> library(kernlab)
> data(spam)
#spamデータを学習データと評価データに2分割
> tr.index<-sample(nrow(spam),as.integer(nrow(spam)*0.5))
> spam.train<-spam[tr.index,]
> spam.test<-spam[-tr.index,]
>
> library(e1071)
> model<-naiveBayes(type~.,data=spam.train)
> model
Naive Bayes Classifier for Discrete Predictors
Call:
naiveBayes.default(x = X, y = Y, laplace = laplace)
A-priori probabilities:
Y
nonspam spam
0.5982609 0.4017391
Conditional probabilities:
make
Y [,1] [,2]
nonspam 0.07376453 0.3107451
spam 0.14967532 0.3214824
address
Y [,1] [,2]
nonspam 0.2922602 1.816776
spam 0.1625866 0.329196
all
Y [,1] [,2]
nonspam 0.1853343 0.4714455
spam 0.3988853 0.4659669
num3d
Y [,1] [,2]
nonspam 0.0007630814 0.01237295
spam 0.1405735931 2.08201034
our
Y [,1] [,2]
nonspam 0.1898401 0.6610166
spam 0.5208333 0.6931392
over
Y [,1] [,2]
nonspam 0.04448401 0.2446386
spam 0.17523810 0.3174882
remove
Y [,1] [,2]
nonspam 0.0109375 0.1048625
spam 0.2909740 0.6281295
internet
Y [,1] [,2]
nonspam 0.03468023 0.2146108
spam 0.21385281 0.5703914
order
Y [,1] [,2]
nonspam 0.04676599 0.2428200
spam 0.17455628 0.3498043
mail
Y [,1] [,2]
nonspam 0.1781904 0.7692418
spam 0.3242532 0.5337798
receive
Y [,1] [,2]
nonspam 0.01967297 0.1340848
spam 0.11744589 0.2578563
will
Y [,1] [,2]
nonspam 0.4931541 0.9332249
spam 0.5451190 0.6700317
people
Y [,1] [,2]
nonspam 0.06849564 0.2890134
spam 0.14792208 0.3805220
report
Y [,1] [,2]
nonspam 0.02162791 0.1705332
spam 0.08379870 0.3002757
addresses
Y [,1] [,2]
nonspam 0.008328488 0.09322721
spam 0.116536797 0.38772207
free
Y [,1] [,2]
nonspam 0.09153343 0.8284615
spam 0.48649351 0.8956252
business
Y [,1] [,2]
nonspam 0.04949855 0.2267984
spam 0.27961039 0.6387855
email
Y [,1] [,2]
nonspam 0.1113663 0.4555041
spam 0.3364502 0.6936851
you
Y [,1] [,2]
nonspam 1.295538 1.834515
spam 2.199524 1.576138
credit
Y [,1] [,2]
nonspam 0.005588663 0.07934057
spam 0.182132035 0.62194110
your
Y [,1] [,2]
nonspam 0.4356759 0.9792147
spam 1.3628571 1.2223474
font
Y [,1] [,2]
nonspam 0.03786337 0.5322798
spam 0.24167749 1.5125435
num000
Y [,1] [,2]
nonspam 0.006235465 0.05578726
spam 0.259556277 0.52747597
money
Y [,1] [,2]
nonspam 0.01603924 0.2630207
spam 0.19656926 0.5130490
hp
Y [,1] [,2]
nonspam 0.92818314 2.0957413
spam 0.01589827 0.1601007
hpl
Y [,1] [,2]
nonspam 0.45960756 1.20543961
spam 0.00784632 0.09869909
george
Y [,1] [,2]
nonspam 1.256853198 4.17507571
spam 0.002435065 0.04554974
num650
Y [,1] [,2]
nonspam 0.21801599 0.6785334
spam 0.02849567 0.4153449
lab
Y [,1] [,2]
nonspam 0.1716787791 0.73044217
spam 0.0009090909 0.01663568
labs
Y [,1] [,2]
nonspam 0.182507267 0.6037087
spam 0.006926407 0.0911006
telnet
Y [,1] [,2]
nonspam 0.117303779 0.57757505
spam 0.001904762 0.04662206
num857
Y [,1] [,2]
nonspam 0.084084302 0.43261627
spam 0.001017316 0.02185453
data
Y [,1] [,2]
nonspam 0.15429506 0.7498340
spam 0.01642857 0.1214887
num415
Y [,1] [,2]
nonspam 0.084752907 0.43333977
spam 0.002077922 0.04670289
num85
Y [,1] [,2]
nonspam 0.191162791 0.79480016
spam 0.005790043 0.04578109
technology
Y [,1] [,2]
nonspam 0.14553779 0.4868970
spam 0.02979437 0.1487502
num1999
Y [,1] [,2]
nonspam 0.20130814 0.5096179
spam 0.05012987 0.3137661
parts
Y [,1] [,2]
nonspam 0.016984012 0.25985056
spam 0.005963203 0.06271436
pm
Y [,1] [,2]
nonspam 0.10350291 0.46125577
spam 0.01233766 0.09183663
direct
Y [,1] [,2]
nonspam 0.09345930 0.4597816
spam 0.04044372 0.1634167
cs
Y [,1] [,2]
nonspam 0.0845712209 0.516261348
spam 0.0001082251 0.003289758
meeting
Y [,1] [,2]
nonspam 0.232245640 1.05697137
spam 0.001818182 0.02240534
original
Y [,1] [,2]
nonspam 0.072747093 0.28407120
spam 0.008084416 0.04760242
project
Y [,1] [,2]
nonspam 0.118393895 0.68212564
spam 0.006991342 0.06366247
re
Y [,1] [,2]
nonspam 0.4144404 1.2385933
spam 0.1314719 0.3566903
edu
Y [,1] [,2]
nonspam 0.29623547 1.1975767
spam 0.01768398 0.1546024
table
Y [,1] [,2]
nonspam 0.007841570 0.07898948
spam 0.001114719 0.01904980
conference
Y [,1] [,2]
nonspam 0.047994186 0.36349442
spam 0.002186147 0.03058448
charSemicolon
Y [,1] [,2]
nonspam 0.04367515 0.2723684
spam 0.01937662 0.0887151
charRoundbracket
Y [,1] [,2]
nonspam 0.1589193 0.2468231
spam 0.1191504 0.3739926
charSquarebracket
Y [,1] [,2]
nonspam 0.024977471 0.15585250
spam 0.009510823 0.05389867
charExclamation
Y [,1] [,2]
nonspam 0.1006890 0.6691533
spam 0.5227511 0.7599010
charDollar
Y [,1] [,2]
nonspam 0.01092878 0.05918727
spam 0.19095022 0.40034431
charHash
Y [,1] [,2]
nonspam 0.02520276 0.2862500
spam 0.07279221 0.4870051
capitalAve
Y [,1] [,2]
nonspam 2.294866 1.904008
spam 9.177057 47.553399
capitalLong
Y [,1] [,2]
nonspam 17.17006 24.8506
spam 112.11905 376.5911
capitalTotal
Y [,1] [,2]
nonspam 153.5291 344.8686
spam 501.9708 954.0788
> pr.model<-predict(model,spam.test[,-58],type="class")
> pr.model
[1] spam spam spam spam spam spam spam spam
[9] spam spam spam spam spam spam spam spam
[17] spam spam spam spam spam spam spam spam
[25] spam spam spam spam spam spam spam spam
[33] spam spam spam spam spam spam spam spam
[41] spam spam spam spam spam spam spam spam
[49] spam spam spam spam spam spam spam spam
[57] spam nonspam spam spam spam spam spam spam
[65] spam spam spam spam spam spam spam spam
[73] spam spam spam spam spam spam spam spam
[81] spam spam spam spam spam spam spam nonspam
[89] spam spam spam spam spam spam spam spam
[97] spam spam spam spam spam spam spam spam
[105] spam spam spam spam spam spam spam spam
[113] spam spam spam spam spam spam spam spam
[121] nonspam spam spam spam spam spam spam spam
[129] spam spam spam spam spam spam spam spam
[137] spam spam spam spam spam spam spam spam
[145] spam spam spam spam spam spam spam spam
[153] spam spam spam spam spam spam spam spam
[161] spam spam spam spam spam spam spam spam
[169] spam spam spam spam spam spam spam spam
[177] spam spam spam spam spam spam spam spam
[185] spam spam spam spam spam spam spam spam
[193] spam spam spam spam spam spam spam spam
[201] spam spam spam spam spam spam spam spam
[209] spam spam spam spam spam spam spam spam
[217] spam spam nonspam spam spam spam spam spam
[225] spam spam spam spam spam spam spam spam
[233] spam spam spam spam spam spam spam spam
[241] spam spam spam spam spam spam spam spam
[249] spam spam spam spam spam spam spam spam
[257] spam spam spam spam spam nonspam nonspam spam
[265] spam spam spam spam spam spam spam spam
[273] spam spam spam spam spam spam nonspam spam
[281] spam spam spam spam spam spam spam spam
[289] spam spam spam spam spam spam spam spam
[297] spam nonspam spam spam spam spam spam spam
[305] spam spam spam spam spam spam spam spam
[313] spam spam spam spam spam spam spam spam
[321] spam spam spam spam spam spam spam spam
[329] spam spam spam spam spam spam spam spam
[337] spam spam spam spam spam spam spam spam
[345] spam spam spam spam spam spam spam spam
[353] spam spam spam spam spam spam nonspam spam
[361] spam spam spam nonspam spam spam spam spam
[369] spam spam spam spam spam spam spam spam
[377] spam spam spam nonspam spam spam spam spam
[385] spam spam spam spam nonspam spam spam spam
[393] spam spam spam spam spam spam spam spam
[401] spam spam spam spam spam spam spam spam
[409] spam spam spam spam spam spam spam spam
[417] spam spam spam spam spam spam spam spam
[425] spam spam spam spam spam spam spam spam
[433] spam spam spam spam spam spam spam spam
[441] spam spam spam spam spam spam spam spam
[449] spam spam spam spam spam spam spam spam
[457] spam spam spam spam spam spam spam spam
[465] spam spam spam spam spam spam spam spam
[473] spam spam spam spam spam spam spam spam
[481] spam spam spam spam spam spam spam spam
[489] spam spam spam spam spam spam spam spam
[497] spam nonspam spam spam spam spam spam spam
[505] spam spam spam nonspam spam spam spam spam
[513] nonspam spam spam spam spam spam spam spam
[521] spam spam spam spam spam spam spam spam
[529] spam nonspam spam spam spam spam spam spam
[537] spam spam spam spam nonspam spam spam spam
[545] spam spam spam spam spam spam nonspam spam
[553] spam nonspam spam spam spam spam spam spam
[561] spam spam spam spam spam spam spam nonspam
[569] spam spam spam spam spam spam spam spam
[577] spam spam spam spam spam spam spam spam
[585] spam spam spam spam spam spam spam spam
[593] spam spam spam spam spam spam spam spam
[601] nonspam spam spam spam spam spam spam spam
[609] spam spam spam spam spam spam spam spam
[617] spam spam spam spam spam spam spam spam
[625] spam spam nonspam spam spam spam nonspam spam
[633] spam spam spam spam spam spam spam spam
[641] spam spam spam spam spam spam spam spam
[649] spam spam spam spam spam spam spam spam
[657] spam spam spam spam spam spam spam spam
[665] spam spam spam spam spam spam spam spam
[673] spam spam nonspam spam spam spam spam spam
[681] spam spam spam spam spam nonspam spam spam
[689] spam spam spam spam spam spam spam spam
[697] spam spam spam spam spam spam spam spam
[705] spam spam spam spam spam nonspam spam spam
[713] spam spam spam spam spam spam spam spam
[721] spam spam spam spam spam spam spam spam
[729] nonspam spam spam spam spam spam spam spam
[737] spam spam spam spam spam spam spam spam
[745] nonspam spam spam spam spam spam spam spam
[753] spam spam spam spam nonspam spam spam spam
[761] nonspam nonspam spam spam spam spam spam spam
[769] spam spam spam spam spam spam spam spam
[777] spam spam spam spam spam spam spam spam
[785] spam spam spam nonspam spam spam spam spam
[793] spam spam spam spam spam spam nonspam spam
[801] spam spam nonspam spam spam spam spam spam
[809] spam spam spam spam spam spam spam spam
[817] nonspam spam spam spam spam nonspam spam spam
[825] spam spam spam spam spam spam spam spam
[833] spam spam spam spam spam nonspam spam nonspam
[841] spam spam spam nonspam spam spam spam spam
[849] spam spam spam spam spam spam spam spam
[857] spam spam spam spam nonspam spam spam spam
[865] spam nonspam spam spam spam spam spam spam
[873] spam spam spam spam spam spam spam spam
[881] spam spam spam spam nonspam spam spam spam
[889] spam spam nonspam spam spam spam spam nonspam
[897] nonspam nonspam nonspam nonspam nonspam nonspam spam spam
[905] nonspam nonspam spam nonspam spam nonspam spam nonspam
[913] spam nonspam nonspam spam spam nonspam nonspam spam
[921] nonspam spam nonspam nonspam spam nonspam nonspam nonspam
[929] nonspam nonspam spam spam spam nonspam spam spam
[937] nonspam nonspam nonspam nonspam nonspam spam spam nonspam
[945] nonspam spam spam nonspam nonspam spam nonspam nonspam
[953] nonspam spam spam nonspam nonspam spam nonspam nonspam
[961] spam nonspam spam nonspam nonspam spam spam spam
[969] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam
[977] nonspam spam spam nonspam nonspam spam spam nonspam
[985] nonspam nonspam spam nonspam nonspam spam nonspam spam
[993] nonspam nonspam nonspam nonspam nonspam nonspam nonspam spam
[1001] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam
[1009] nonspam nonspam nonspam spam spam nonspam nonspam nonspam
[1017] nonspam spam spam nonspam nonspam nonspam nonspam nonspam
[1025] nonspam spam nonspam spam nonspam nonspam nonspam nonspam
[1033] spam spam spam spam spam nonspam spam nonspam
[1041] spam spam spam spam nonspam spam spam nonspam
[1049] spam spam nonspam nonspam nonspam spam spam nonspam
[1057] spam nonspam nonspam nonspam spam nonspam spam spam
[1065] spam spam spam nonspam nonspam nonspam nonspam spam
[1073] nonspam nonspam spam nonspam nonspam nonspam spam nonspam
[1081] nonspam spam spam nonspam nonspam spam nonspam nonspam
[1089] spam nonspam spam nonspam nonspam nonspam nonspam spam
[1097] nonspam nonspam nonspam nonspam spam nonspam nonspam spam
[1105] spam spam nonspam spam nonspam spam nonspam spam
[1113] nonspam nonspam spam spam spam nonspam spam nonspam
[1121] spam nonspam spam spam nonspam spam nonspam nonspam
[1129] nonspam nonspam nonspam nonspam nonspam spam nonspam spam
[1137] nonspam spam nonspam nonspam nonspam nonspam nonspam spam
[1145] nonspam nonspam spam spam nonspam nonspam nonspam spam
[1153] nonspam spam nonspam spam spam spam spam nonspam
[1161] spam nonspam nonspam nonspam spam nonspam spam nonspam
[1169] nonspam nonspam spam nonspam nonspam spam nonspam spam
[1177] spam nonspam spam spam nonspam spam nonspam nonspam
[1185] nonspam nonspam spam nonspam nonspam nonspam nonspam spam
[1193] nonspam nonspam spam nonspam nonspam nonspam nonspam nonspam
[1201] nonspam nonspam nonspam nonspam nonspam spam spam nonspam
[1209] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam
[1217] nonspam nonspam nonspam nonspam nonspam nonspam nonspam spam
[1225] nonspam nonspam nonspam nonspam spam nonspam nonspam spam
[1233] nonspam spam spam spam spam spam spam spam
[1241] nonspam spam nonspam nonspam nonspam spam nonspam nonspam
[1249] nonspam nonspam nonspam nonspam spam nonspam nonspam nonspam
[1257] spam spam spam nonspam nonspam spam nonspam nonspam
[1265] nonspam nonspam nonspam nonspam nonspam nonspam spam spam
[1273] nonspam nonspam spam nonspam nonspam spam spam nonspam
[1281] spam spam nonspam nonspam spam nonspam nonspam nonspam
[1289] spam nonspam nonspam spam spam nonspam nonspam nonspam
[1297] nonspam nonspam nonspam nonspam spam nonspam spam spam
[1305] nonspam spam nonspam nonspam nonspam spam nonspam nonspam
[1313] nonspam spam nonspam nonspam nonspam spam nonspam nonspam
[1321] spam nonspam nonspam nonspam nonspam nonspam spam spam
[1329] nonspam spam nonspam nonspam nonspam spam nonspam spam
[1337] nonspam nonspam nonspam nonspam nonspam nonspam spam spam
[1345] nonspam nonspam nonspam nonspam nonspam nonspam spam nonspam
[1353] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam
[1361] nonspam nonspam nonspam nonspam spam spam nonspam nonspam
[1369] nonspam nonspam nonspam nonspam spam spam spam spam
[1377] nonspam nonspam nonspam spam spam spam nonspam spam
[1385] spam nonspam nonspam spam spam spam spam spam
[1393] spam spam spam nonspam nonspam nonspam spam spam
[1401] spam nonspam nonspam nonspam nonspam nonspam nonspam spam
[1409] nonspam nonspam nonspam spam nonspam spam spam nonspam
[1417] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam
[1425] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam
[1433] spam nonspam spam nonspam nonspam spam spam spam
[1441] nonspam nonspam nonspam spam nonspam spam nonspam spam
[1449] spam nonspam nonspam nonspam nonspam nonspam spam spam
[1457] nonspam spam nonspam nonspam nonspam nonspam spam spam
[1465] nonspam nonspam nonspam spam spam nonspam nonspam nonspam
[1473] nonspam spam nonspam spam spam nonspam nonspam nonspam
[1481] nonspam spam nonspam spam nonspam spam nonspam nonspam
[1489] nonspam nonspam nonspam nonspam nonspam spam nonspam spam
[1497] spam nonspam spam spam spam nonspam spam nonspam
[1505] nonspam spam spam spam spam nonspam spam nonspam
[1513] nonspam nonspam nonspam spam nonspam nonspam nonspam nonspam
[1521] nonspam nonspam nonspam spam nonspam spam nonspam nonspam
[1529] nonspam spam nonspam nonspam nonspam spam nonspam nonspam
[1537] spam nonspam nonspam spam spam spam spam spam
[1545] spam spam spam spam spam spam spam spam
[1553] spam spam spam nonspam nonspam nonspam nonspam nonspam
[1561] nonspam spam spam spam nonspam spam spam nonspam
[1569] spam spam spam nonspam spam nonspam nonspam nonspam
[1577] spam nonspam spam nonspam spam nonspam spam nonspam
[1585] nonspam nonspam nonspam nonspam spam nonspam spam nonspam
[1593] spam spam spam spam spam spam nonspam spam
[1601] spam nonspam nonspam nonspam nonspam spam nonspam nonspam
[1609] nonspam nonspam nonspam nonspam spam nonspam nonspam nonspam
[1617] spam nonspam spam nonspam nonspam spam nonspam nonspam
[1625] spam spam nonspam nonspam spam spam spam spam
[1633] nonspam nonspam spam nonspam spam nonspam nonspam nonspam
[1641] nonspam spam nonspam spam nonspam nonspam spam spam
[1649] spam nonspam nonspam nonspam nonspam nonspam spam spam
[1657] spam nonspam nonspam nonspam spam nonspam nonspam nonspam
[1665] nonspam spam spam spam spam spam nonspam nonspam
[1673] spam nonspam nonspam spam spam nonspam nonspam nonspam
[1681] nonspam spam nonspam spam nonspam nonspam spam spam
[1689] nonspam nonspam spam spam nonspam nonspam spam nonspam
[1697] spam nonspam nonspam spam nonspam nonspam spam spam
[1705] nonspam nonspam spam nonspam spam spam nonspam spam
[1713] nonspam spam nonspam nonspam nonspam spam nonspam nonspam
[1721] spam nonspam nonspam spam spam nonspam nonspam spam
[1729] spam nonspam nonspam nonspam nonspam nonspam spam nonspam
[1737] nonspam spam nonspam spam nonspam nonspam nonspam nonspam
[1745] nonspam nonspam spam nonspam nonspam nonspam nonspam spam
[1753] nonspam spam spam nonspam nonspam nonspam nonspam nonspam
[1761] nonspam nonspam nonspam nonspam nonspam spam spam nonspam
[1769] spam spam spam spam spam spam spam spam
[1777] spam spam spam spam spam spam spam spam
[1785] spam spam spam spam spam spam spam spam
[1793] spam spam spam spam spam spam spam spam
[1801] spam spam spam spam spam spam spam spam
[1809] spam spam spam spam spam spam spam spam
[1817] spam spam spam spam spam spam spam spam
[1825] spam spam spam spam spam spam spam nonspam
[1833] nonspam spam nonspam nonspam spam nonspam nonspam spam
[1841] spam nonspam nonspam nonspam spam nonspam nonspam nonspam
[1849] nonspam nonspam spam spam nonspam nonspam nonspam nonspam
[1857] spam nonspam nonspam nonspam nonspam spam nonspam nonspam
[1865] nonspam nonspam nonspam spam spam spam nonspam nonspam
[1873] spam spam spam spam spam nonspam spam spam
[1881] nonspam spam nonspam spam spam nonspam nonspam nonspam
[1889] nonspam nonspam nonspam nonspam nonspam spam nonspam spam
[1897] spam spam spam spam nonspam spam nonspam nonspam
[1905] nonspam nonspam nonspam nonspam spam nonspam nonspam nonspam
[1913] spam spam nonspam nonspam spam nonspam nonspam spam
[1921] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam
[1929] nonspam nonspam nonspam nonspam spam nonspam nonspam nonspam
[1937] nonspam nonspam nonspam nonspam nonspam nonspam nonspam nonspam
[1945] spam nonspam nonspam spam nonspam nonspam spam nonspam
[1953] spam nonspam spam spam spam spam nonspam nonspam
[1961] nonspam nonspam nonspam spam nonspam nonspam nonspam nonspam
[1969] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam
[1977] spam nonspam spam nonspam spam nonspam nonspam nonspam
[1985] spam nonspam nonspam spam nonspam nonspam nonspam spam
[1993] spam nonspam nonspam nonspam nonspam spam spam nonspam
[2001] spam nonspam nonspam spam spam spam nonspam nonspam
[2009] spam nonspam spam nonspam nonspam nonspam nonspam spam
[2017] spam nonspam spam spam nonspam spam spam nonspam
[2025] nonspam spam nonspam nonspam spam nonspam spam nonspam
[2033] nonspam spam nonspam nonspam nonspam nonspam nonspam nonspam
[2041] nonspam nonspam nonspam spam spam spam nonspam spam
[2049] spam spam spam nonspam spam nonspam nonspam nonspam
[2057] nonspam nonspam nonspam nonspam spam spam spam nonspam
[2065] nonspam spam spam nonspam spam nonspam spam spam
[2073] spam spam spam spam spam spam spam spam
[2081] spam spam spam nonspam nonspam nonspam spam spam
[2089] spam spam nonspam spam spam spam spam nonspam
[2097] spam spam spam spam spam nonspam spam spam
[2105] spam spam spam nonspam spam spam spam nonspam
[2113] spam spam spam spam spam spam spam nonspam
[2121] spam nonspam spam spam nonspam spam spam nonspam
[2129] spam spam nonspam nonspam nonspam nonspam nonspam nonspam
[2137] spam spam nonspam spam nonspam spam nonspam spam
[2145] spam spam spam spam spam nonspam spam nonspam
[2153] spam spam spam spam spam spam nonspam nonspam
[2161] spam nonspam spam nonspam nonspam spam nonspam spam
[2169] nonspam nonspam spam spam spam nonspam spam nonspam
[2177] spam spam nonspam spam spam spam spam spam
[2185] spam spam nonspam spam spam spam spam spam
[2193] spam spam spam nonspam nonspam spam spam spam
[2201] spam spam nonspam nonspam spam spam nonspam spam
[2209] nonspam nonspam nonspam nonspam spam spam spam spam
[2217] spam spam spam spam spam spam spam spam
[2225] spam spam nonspam spam spam spam nonspam spam
[2233] nonspam nonspam spam spam nonspam nonspam nonspam spam
[2241] spam nonspam spam spam spam spam spam spam
[2249] spam nonspam nonspam spam nonspam spam spam nonspam
[2257] spam nonspam nonspam nonspam spam nonspam spam nonspam
[2265] nonspam nonspam spam spam nonspam nonspam spam nonspam
[2273] spam nonspam spam spam spam nonspam nonspam spam
[2281] nonspam spam nonspam spam spam nonspam spam spam
[2289] nonspam spam nonspam nonspam nonspam nonspam nonspam spam
[2297] spam spam nonspam spam spam
Levels: nonspam spam
> table(spam.test[,58],pr.model)
pr.model
nonspam spam
nonspam 778 634
spam 42 847
> result<-table(spam.test$type,pr.model)
> 2*result[2,2]/(2*result[2,2]+result[1,2]+result[2,1])
[1] 0.7147679
|
マルコフ連鎖モンテカルロ法(MCMC)
ギッブスサンプリング
ある正規母集団から15個の観測値yを得た場合のパラメータ(μ, σ2)の事後分布をギッブスサンプリングを用いて評価する。 この際の事前分布はμ〜N(μ0、σ20)、σ2〜IG(r0/2,s0/2)を仮定する。
> y<-c(1.53,19.02,5.34,-2.16,0.83,6.74,5.53,0.65,-0.49,-0.08,5.31,3.18,7.19,4.23,6.45)
> my<-mean(y)
> n<-length(y)
> iterations<-3500
> m0<-0
> s0<-100
> a0<-0.01
> b0<-0.01
> theta<-matrix(nrow=iterations,ncol=2)
> cur.mu<-0
> cur.tau<-1
> cur.s<-sqrt(1/cur.tau)
> for(t in 1:iterations){
+ w<-s0^2/(cur.s^2/n+s0^2)
+ m<-w*my+(1-w)*m0
+ s<-sqrt(w+cur.s^2/2)
+ cur.mu<-rnorm(1,m,s)
+ a<-a0+0.5*n
+ b<-b0+0.5*sum((y-cur.mu)^2)
+ cur.tau<-rgamma(1,a,b)
+ cur.s<-sqrt(1/cur.tau)
+ theta[t,]<-c(cur.mu,cur.s)}
> op<-par(mfrow=c(2,1))
> p=1:iterations
> plot(p,theta[,1],type="l",main="mu",xlab="iteration",ylab="mu",lty=1,lwd=1)
> plot(p,theta[,2],type="l",main="sigma",xlab="iteration",ylab="sigma",lty=1,lwd=1)
> par(op)
|
メトロポリス-ヘイスティングス
線形回帰モデル
正規線形回帰モデル:ギッブスサンプリング
> library(MCMCpack)
> regdata<-list(X=c(-2,-1,0,1,2,3),Y=c(1,3,3,3,5,6))
> posterior<-MCMCregress(Y~X,data=regdata,burnin=1000,mcmc=1000,b0=0.0,B0=0,c0=2,d0=0.001,verbose=1000)
MCMCregress iteration 1 of 2000
beta =
3.15047
0.87636
sigma2 = 0.18268
MCMCregress iteration 1001 of 2000
beta =
2.80864
0.85773
sigma2 = 0.12893
> plot(posterior)
> summary(posterior)
Iterations = 1001:2000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 1000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
(Intercept) 3.0691 0.2536 0.008019 0.007150
X 0.8854 0.1483 0.004691 0.003937
sigma2 0.4067 0.3146 0.009950 0.012335
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
(Intercept) 2.5580 2.9199 3.0691 3.2250 3.587
X 0.5911 0.7966 0.8849 0.9771 1.184
sigma2 0.1197 0.2167 0.3189 0.4743 1.298
|
トービットモデル:ギッブスサンプリング
> library(survival)
> example(tobin)
tobin> tfit <- survreg(Surv(durable, durable>0, type='left') ~age + quant,
tobin+ data=tobin, dist='gaussian')
tobin> predict(tfit,type="response")
[1] -3.04968679 -4.31254182 -0.54163315 -0.25607164 -1.85017727 -2.40987803
[7] -3.50629220 -0.74041486 -4.05145594 -3.55880518 -0.32223237 -3.68044619
[13] -3.65997456 -2.63254564 0.22382063 0.02177674 -0.09571284 -3.17696755
[19] -0.61521215 -3.13913903
> summary(tfit)
Call:
survreg(formula = Surv(durable, durable > 0, type = "left") ~
age + quant, data = tobin, dist = "gaussian")
Value Std. Error z p
(Intercept) 15.1449 16.0795 0.942 3.46e-01
age -0.1291 0.2186 -0.590 5.55e-01
quant -0.0455 0.0583 -0.782 4.34e-01
Log(scale) 1.7179 0.3103 5.536 3.10e-08
Scale= 5.57
Gaussian distribution
Loglik(model)= -28.9 Loglik(intercept only)= -29.5
Chisq= 1.1 on 2 degrees of freedom, p= 0.58
Number of Newton-Raphson Iterations: 3
n= 20
> tfit.mcmc<-MCMCtobit(durable~age+quant,data=tobin,mcmc=20000,b0=0.0,B0=0,c0=0.001,d0=0.001,verbose=1000)
MCMCtobit iteration 1 of 21000
beta =
14.53324
-0.17875
-0.02518
sigma2 = 8.41402
MCMCtobit iteration 1001 of 21000
beta =
52.55459
-0.55878
-0.15851
sigma2 = 454.21928
MCMCtobit iteration 2001 of 21000
beta =
19.96620
-0.17549
-0.07358
sigma2 = 73.64337
MCMCtobit iteration 3001 of 21000
beta =
215.85382
-4.63364
-0.39967
sigma2 = 7795.87892
MCMCtobit iteration 4001 of 21000
beta =
36.92052
-1.40173
0.05722
sigma2 = 818.75771
MCMCtobit iteration 5001 of 21000
beta =
-20.32282
0.56543
-0.05620
sigma2 = 329.88211
MCMCtobit iteration 6001 of 21000
beta =
-19.99355
0.29027
-0.02940
sigma2 = 158.42768
MCMCtobit iteration 7001 of 21000
beta =
54.44763
-0.19122
-0.19175
sigma2 = 84.79789
MCMCtobit iteration 8001 of 21000
beta =
21.27023
-1.47644
0.15638
sigma2 = 209.03197
MCMCtobit iteration 9001 of 21000
beta =
19.55942
0.00574
-0.11454
sigma2 = 178.02732
MCMCtobit iteration 10001 of 21000
beta =
-23.13812
-0.37182
0.11873
sigma2 = 273.72350
MCMCtobit iteration 11001 of 21000
beta =
-36.96670
-0.38160
0.19984
sigma2 = 230.54203
MCMCtobit iteration 12001 of 21000
beta =
27.92896
-0.05697
-0.12680
sigma2 = 54.93730
MCMCtobit iteration 13001 of 21000
beta =
38.62670
-0.31182
-0.11833
sigma2 = 84.74766
MCMCtobit iteration 14001 of 21000
beta =
20.04418
-0.19912
-0.05552
sigma2 = 32.96554
MCMCtobit iteration 15001 of 21000
beta =
4.07612
0.30036
-0.10512
sigma2 = 199.13058
MCMCtobit iteration 16001 of 21000
beta =
9.04425
0.12401
-0.08732
sigma2 = 49.02667
MCMCtobit iteration 17001 of 21000
beta =
23.73888
-0.07277
-0.09530
sigma2 = 61.10130
MCMCtobit iteration 18001 of 21000
beta =
11.13250
-0.18867
-0.02441
sigma2 = 21.97008
MCMCtobit iteration 19001 of 21000
beta =
50.45571
-0.33907
-0.21748
sigma2 = 282.25100
MCMCtobit iteration 20001 of 21000
beta =
43.70535
-0.17534
-0.18929
sigma2 = 91.96757
> plot(tfit.mcmc)
> summary(tfit.mcmc)
Iterations = 1001:21000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 20000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
(Intercept) 18.53980 39.2003 0.277188 0.499879
age -0.29145 0.5818 0.004114 0.009942
quant -0.04648 0.1422 0.001006 0.001427
sigma2 173.09722 361.0501 2.553010 12.107377
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
(Intercept) -55.1223 -0.5865 17.6714 36.12702 98.4133
age -1.6044 -0.5153 -0.2242 0.01572 0.6172
quant -0.3281 -0.1124 -0.0460 0.01921 0.2295
sigma2 19.6001 48.3753 86.6096 173.74384 823.2478
|
2項プロビットモデル:ギッブスサンプリング
> data(birthwt)
> posterior<-MCMCprobit(low~smoke,data=birthwt)
> plot(posterior)
> summary(posterior)
Iterations = 1001:11000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 10000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
(Intercept) -0.6704 0.1257 0.001257 0.002232
smoke 0.4355 0.1941 0.001941 0.002763
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
(Intercept) -0.9186 -0.7555 -0.6710 -0.5856 -0.4275
smoke 0.0575 0.3056 0.4341 0.5665 0.8181
|
2項ロジットモデル:M-Hサンプリング
> data(birthwt)
> posterior<-MCMClogit(low~age+smoke,b0=0,B0=0.001,burnin=1000,mcmc=10000,data=birthwt)
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
The Metropolis acceptance rate for beta was 0.41109
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> plot(posterior)
> summary(posterior)
Iterations = 1001:11000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 10000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
(Intercept) 0.09399 0.78373 0.0078373 0.024541
age -0.05192 0.03295 0.0003295 0.001068
smoke 0.69249 0.32523 0.0032523 0.010616
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
(Intercept) -1.47912 -0.42880 0.08399 0.62812 1.5886
age -0.11908 -0.07403 -0.05057 -0.03039 0.0123
smoke 0.05021 0.47021 0.68683 0.91203 1.3518
|
WinBUGSの導入
http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml
32Bit版のOSを使用している場合、Quick startからWinBugs14.exeをダウンロードし、インストール。
64Bit版の場合は、1.Zipファイルごとダウンロードし、展開したものをProgram Filesへ移動。
2.WinBugを起動し、[File]→[Open]を選択し、パッチファイルを開く。 [Tools]→[Decode]を選択し、[Decode ALL]を選択。
Back to R






