How to debug the "contrasts can be applied only to factors with 2 or more levels" error?
Here are all the variables I am working with:
str(ad.train)
 $ Date : Factor w/ 427 levels "2012-03-24","2012-03-29",..: 4 7 12 14 19 21 24 29 31 34 ...
 $ Team : Factor w/ 18 levels "Adelaide","Brisbane Lions",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Season : int 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
 $ Round : Factor w/ 28 levels "EF","GF","PF",..: 5 16 21 22 23 24 25 26 27 6 ...
 $ Score : int 137 82 84 96 110 99 122 124 49 111 ...
 $ Margin : int 69 18 -56 46 19 5 50 69 -26 29 ...
 $ WinLoss : Factor w/ 2 levels "0","1": 2 2 1 2 2 2 2 2 1 2 ...
 $ Opposition : Factor w/ 18 levels "Adelaide","Brisbane Lions",..: 8 18 10 9 13 16 7 3 4 6 ...
 $ Venue : Factor w/ 19 levels "Adelaide Oval",..: 4 7 10 7 7 13 7 6 7 15 ...
 $ Disposals : int 406 360 304 370 359 362 365 345 324 351 ...
 $ Kicks : int 252 215 170 225 221 218 224 230 205 215 ...
 $ Marks : int 109 102 52 41 95 78 93 110 69 85 ...
 $ Handballs : int 154 145 134 145 138 144 141 115 119 136 ...
 $ Goals : int 19 11 12 13 16 15 19 19 6 17 ...
 $ Behinds : int 19 14 9 16 11 6 7 9 12 6 ...
 $ Hitouts : int 42 41 34 47 45 70 48 54 46 34 ...
 $ Tackles : int 73 53 51 76 65 63 65 67 77 58 ...
 $ Rebound50s : int 28 34 23 24 32 48 39 31 34 29 ...
 $ Inside50s : int 73 49 49 56 61 45 47 50 49 48 ...
 $ Clearances : int 39 33 38 52 37 43 43 48 37 52 ...
 $ Clangers : int 47 38 44 62 49 46 32 24 31 41 ...
 $ FreesFor : int 15 14 15 18 17 15 19 14 18 20 ...
 $ ContendedPossessions: int 152 141 149 192 138 164 148 151 160 155 ...
 $ ContestedMarks : int 10 16 11 3 12 12 17 14 15 11 ...
 $ MarksInside50 : int 16 13 10 8 12 9 14 13 6 12 ...
 $ OnePercenters : int 42 54 30 58 24 56 32 53 50 57 ...
 $ Bounces : int 1 6 4 4 1 7 11 14 0 4 ...
 $ GoalAssists : int 15 6 9 10 9 12 13 14 5 14 ...
This is the glm I am trying to fit:
ad.glm.all <- glm(WinLoss ~ factor(Team) + Season + Round + Score + Margin +
                    Opposition + Venue + Disposals + Kicks + Marks + Handballs +
                    Goals + Behinds + Hitouts + Tackles + Rebound50s + Inside50s +
                    Clearances + Clangers + FreesFor + ContendedPossessions +
                    ContestedMarks + MarksInside50 + OnePercenters + Bounces +
                    GoalAssists,
                  data = ad.train, family = binomial(logit))
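I know this is a lot of variables (the plan is to reduce them through forward variable selection). But even with this many variables, each one is either an int or a Factor, which as I understand it should just work with glm. However, every time I try to fit this model, I get: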
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
To me this looks as if R, for some reason, is not treating my factor variables as factor variables?
Even something as simple as this:
ad.glm.test <- glm(WinLoss ~ factor(Team), data = ad.train, family = binomial(logit))
does not work! (Same error message.)
Whereas this:
ad.glm.test <- glm(WinLoss ~ Clearances, data = ad.train, family = binomial(logit))
does work!
Does anyone know what is going on here? Why can't I fit these factor variables in my glm?
Thanks in advance!
-Troy
Run a thorough check with the following. Any factor that reports fewer than 2 levels is what triggers the error:
## remove incomplete cases
dat <- na.omit(ad.train)
## extract factor columns and drop redundant levels
fctr <- lapply(dat[sapply(dat, is.factor)], droplevels)
## count levels
sapply(fctr, nlevels)
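To see why this check matters, here is a minimal sketch on made-up data (the `toy` data frame and the `single_level_factors` helper are hypothetical, not part of the original post). It assumes the likely situation suggested by the `str()` output above: a factor that declares many levels but only shows one of them in the rows actually used. `factor()` and `glm()` both drop unused levels, so such a column ends up with a single level and the contrasts step fails.

```r
## Toy data (hypothetical, not the poster's ad.train): Team declares three
## levels, but only "Adelaide" actually occurs in the rows.
toy <- data.frame(
  WinLoss    = factor(c(1, 0, 1, 1, 0, 1, 0, 0)),
  Team       = factor(rep("Adelaide", 8),
                      levels = c("Adelaide", "Brisbane Lions", "Carlton")),
  Venue      = factor(c("A", "B", "A", "B", "A", "B", "A", "B")),
  Clearances = c(39, 33, 38, 52, 37, 43, 43, 48)
)

nlevels(toy$Team)          # declared levels: 3
nlevels(factor(toy$Team))  # levels actually present: 1

## Fitting with factor(Team) fails because the re-created factor has one level.
## tryCatch keeps the script running and captures the error message instead.
msg <- tryCatch(
  {
    glm(WinLoss ~ factor(Team), data = toy, family = binomial(logit))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg

## Hypothetical helper: names of the factor columns that have fewer than
## 2 used levels among the complete cases.
single_level_factors <- function(df) {
  dat <- na.omit(df)
  is_fctr <- vapply(dat, is.factor, logical(1))
  n_used <- vapply(dat[is_fctr], function(x) nlevels(droplevels(x)), integer(1))
  names(n_used)[n_used < 2]
}

single_level_factors(toy)
```

If the helper returns any column names for your own data, those terms cannot be estimated from that subset. The usual options are to drop them from the formula or to fit on data where they actually vary.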