dummies package
"dummies"
A package for creating dummy variables.
Brought in because sampselect from endogenous does not support factors.
See the sample code below.
> library(dummies)
> letters <- c( "a", "a", "b", "c", "d", "e", "f", "g", "h", "b", "b" )
> dummy( as.character(letters) )
as.character(letters)a as.character(letters)b as.character(letters)c
[1,] 1 0 0
[2,] 1 0 0
[3,] 0 1 0
[4,] 0 0 1
[5,] 0 0 0
[6,] 0 0 0
[7,] 0 0 0
[8,] 0 0 0
[9,] 0 0 0
[10,] 0 1 0
[11,] 0 1 0
as.character(letters)d as.character(letters)e as.character(letters)f
[1,] 0 0 0
[2,] 0 0 0
[3,] 0 0 0
[4,] 0 0 0
[5,] 1 0 0
[6,] 0 1 0
[7,] 0 0 1
[8,] 0 0 0
[9,] 0 0 0
[10,] 0 0 0
[11,] 0 0 0
as.character(letters)g as.character(letters)h
[1,] 0 0
[2,] 0 0
[3,] 0 0
[4,] 0 0
[5,] 0 0
[6,] 0 0
[7,] 0 0
[8,] 1 0
[9,] 0 1
[10,] 0 0
[11,] 0 0
dummy() creates 0/1 dummy variables from a vector or a data frame.
The options are described below.
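For intuition about what this encoding produces, here is a minimal sketch using only base R's model.matrix, so it runs without the dummies package. It is an illustration of the same idea, not the package's own code. The names v, f and df are made up for the example.

```r
# Illustrative data (not from the notes above).
v <- c("a", "a", "b", "c")
f <- factor(v)

# "- 1" drops the intercept so every level gets its own 0/1 column.
m <- model.matrix(~ f - 1)
print(m)

# For a data frame, encode one factor column and bind the result back on.
df  <- data.frame(id = 1:4, grp = factor(v))
df2 <- cbind(df["id"], model.matrix(~ grp - 1, data = df))
print(df2)
```

Each row contains exactly one 1, in the column of the level that row belongs to.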
> l <- as.factor(letters)[ c(1:3,1:6,4:6) ]
> l
[1] a a b a a b c d e c d e
Levels: a b c d e f g h
By default, dummies are created only for a to e, the levels that actually occur:
> dummy(l)
la lb lc ld le
[1,] 1 0 0 0 0
[2,] 1 0 0 0 0
[3,] 0 1 0 0 0 ...
Passing drop = FALSE keeps the dummies for unused levels as well.
(It seems like this should basically always be set.)
The elements of l are only a to e, yet the dummies created are:
> dummy(l, drop=FALSE)
la lb lc ld le lf lg lh
[1,] 1 0 0 0 0 0 0 0
[2,] 1 0 0 0 0 0 0 0
[3,] 0 1 0 0 0 0 0 0
[4,] 1 0 0 0 0 0 0 0
[5,] 1 0 0 0 0 0 0 0 ...
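Unused dummies can appear at all because subsetting a factor keeps its full set of levels. The base-R sketch below shows that behaviour without calling dummy(). The variable names src, used and unused are illustrative.

```r
# Same construction as in the notes, with a differently named source vector.
src <- c("a", "a", "b", "c", "d", "e", "f", "g", "h", "b", "b")
l   <- as.factor(src)[c(1:3, 1:6, 4:6)]

print(levels(l))                 # all eight levels survive the subsetting
print(unique(as.character(l)))   # only five values actually occur

used   <- levels(droplevels(l))      # levels that get a column by default
unused <- setdiff(levels(l), used)   # levels that get a column only with drop = FALSE
print(used)
print(unused)
```

Keeping the unused columns can matter when the same encoding is applied to several subsets of one data set, because every subset then yields a matrix with the same columns.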
sep: controls how the dummies are named, by setting the separator placed between the variable name and the level.
> dummy(l, sep=":")
l:a l:b l:c l:d l:e
[1,] 1 0 0 0 0
[2,] 1 0 0 0 0 ...
fun: determines the type of the dummy values. The default is as.integer.
> dummy(l, sep="::", fun=as.logical)
l::a l::b l::c l::d l::e
[1,] TRUE FALSE FALSE FALSE FALSE
[2,] TRUE FALSE FALSE FALSE FALSE ...
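The effect of sep and fun can be sketched in base R as well. The helper encode() below is hypothetical and only mimics the two options as described in these notes. It is not the code used inside dummies.

```r
# Hypothetical stand-in: `sep` joins the variable name and the level into a
# column name, `fun` converts the cell values to the desired type.
encode <- function(x, name, sep = "", fun = as.integer) {
  f   <- factor(x)
  hit <- outer(as.character(f), levels(f), "==")   # TRUE where a row matches a level
  matrix(fun(hit), nrow = length(f),
         dimnames = list(NULL, paste(name, levels(f), sep = sep)))
}

x <- c("a", "b", "a")
m_int <- encode(x, "l", sep = ":")                      # integer cells, columns l:a l:b
m_lgl <- encode(x, "l", sep = "::", fun = as.logical)   # logical cells, columns l::a l::b
m_num <- encode(x, "l", fun = as.numeric)               # double cells, columns la lb
print(m_int)
print(m_lgl)
print(m_num)
```

Logical dummies are convenient for row filtering, while integer or numeric ones are what most model-fitting functions expect.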