
Processing Cars93 with dplyr ②

한번해보즈아 2021. 4. 5. 12:37

Cars93 data

1. Finding rows that match conditions (filter, slice)

> table(df$Type)

Compact   Large Midsize   Small  Sporty     Van 
     16      11      22      21      14       9 
     
     
> filter(df,Type=="Large" , Price<=20 , MPG.city >=20) ## a comma between conditions means AND
  Manufacturer    Model  Type Min.Price Price Max.Price MPG.city MPG.highway            AirBags DriveTrain
1     Chrylser Concorde Large      18.4  18.4      18.4       20          28 Driver & Passenger      Front
2        Eagle   Vision Large      17.5  19.3      21.2       20          28 Driver & Passenger      Front
  Cylinders EngineSize Horsepower  RPM Rev.per.mile Man.trans.avail Fuel.tank.capacity Passengers Length
1         6        3.3        153 5300         1990              No                 18          6    203
2         6        3.5        214 5800         1980              No                 18          6    202
  Wheelbase Width Turn.circle Rear.seat.room Luggage.room Weight Origin              Make
1       113    74          40             31           15   3515    USA Chrylser Concorde
2       113    74          40             30           15   3490    USA      Eagle Vision


> filter(df,Type=="Large" | Price<=20 | MPG.city >=20) ## | means OR
   Manufacturer          Model    Type Min.Price Price Max.Price MPG.city MPG.highway            AirBags
1         Acura        Integra   Small      12.9  15.9      18.8       25          31               None
2          Audi             90 Compact      25.9  29.1      32.3       20          26        Driver only
3           BMW           535i Midsize      23.7  30.0      36.2       22          30        Driver only
4         Buick        Century Midsize      14.2  15.7      17.3       22          31        Driver only
5         Buick        LeSabre   Large      19.9  20.8      21.7       19          28        Driver only
6         Buick     Roadmaster   Large      22.6  23.7      24.9       16          25        Driver only
7      Cadillac        DeVille   Large      33.0  34.7      36.3       16          25        Driver only
8     Chevrolet       Cavalier Compact       8.5  13.4      18.3       25          36               None
9     Chevrolet        Corsica Compact      11.4  11.4      11.4       25          34        Driver only
10    Chevrolet         Camaro  Sporty      13.4  15.1      16.8       19          28 Driver & Passenger
11    Chevrolet         Lumina Midsize      13.4  15.9      18.4       21          29               None
12    Chevrolet     Lumina_APV     Van      14.7  16.3      18.0       18          23               None
13    Chevrolet          Astro     Van      14.7  16.6      18.6       15          20               None
14    Chevrolet        Caprice   Large      18.0  18.8      19.6       17          26        Driver only
15     Chrylser       Concorde   Large      18.4  18.4      18.4       20          28 Driver & Passenger
16     Chrysler        LeBaron Compact      14.5  15.8      17.1       23          28 Driver & Passenger
17     Chrysler       Imperial   Large      29.5  29.5      29.5       20          26        Driver only
18        Dodge           Colt   Small       7.9   9.2      10.6       29          33               None
19        Dodge         Shadow   Small       8.4  11.3      14.2       23          29        Driver only
20        Dodge         Spirit Compact      11.9  13.3      14.7       22          27        Driver only
21        Dodge        Caravan     Van      13.6  19.0      24.4       17          21        Driver only
22        Dodge        Dynasty Midsize      14.8  15.6      16.4       21          27        Driver only
23        Eagle         Summit   Small       7.9  12.2      16.5       29          33               None
24        Eagle         Vision   Large      17.5  19.3      21.2       20          28 Driver & Passenger
25         Ford        Festiva   Small       6.9   7.4       7.9       31          33               None
26         Ford         Escort   Small       8.4  10.1      11.9       23          30               None
27         Ford          Tempo Compact      10.4  11.3      12.2       22          27               None
28         Ford        Mustang  Sporty      10.8  15.9      21.0       22          29        Driver only
29         Ford          Probe  Sporty      12.8  14.0      15.2       24          30        Driver only
30         Ford       Aerostar     Van      14.5  19.9      25.3       15          20        Driver only
31         Ford         Taurus Midsize      15.6  20.2      24.8       21          30        Driver only
32         Ford Crown_Victoria   Large      20.1  20.9      21.7       18          26        Driver only
33          Geo          Metro   Small       6.7   8.4      10.0       46          50               None
34          Geo          Storm  Sporty      11.5  12.5      13.5       30          36        Driver only
35        Honda        Prelude  Sporty      17.0  19.8      22.7       24          31 Driver & Passenger
36        Honda          Civic   Small       8.4  12.1      15.8       42          46        Driver only
37        Honda         Accord Compact      13.8  17.5      21.2       24          31 Driver & Passenger
   DriveTrain Cylinders EngineSize Horsepower  RPM Rev.per.mile Man.trans.avail Fuel.tank.capacity Passengers
1       Front         4        1.8        140 6300         2890             Yes               13.2          5
2       Front         6        2.8        172 5500         2280             Yes               16.9          5
3        Rear         4        3.5        208 5700         2545             Yes               21.1          4
4       Front         4        2.2        110 5200         2565              No               16.4          6
5       Front         6        3.8        170 4800         1570              No               18.0          6
6        Rear         6        5.7        180 4000         1320              No               23.0          6
7       Front         8        4.9        200 4100         1510              No               18.0          6
8       Front         4        2.2        110 5200         2380             Yes               15.2          5
9       Front         4        2.2        110 5200         2665             Yes               15.6          5
10       Rear         6        3.4        160 4600         1805             Yes               15.5          4
11      Front         4        2.2        110 5200         2595              No               16.5          6
12      Front         6        3.8        170 4800         1690              No               20.0          7
13        4WD         6        4.3        165 4000         1790              No               27.0          8
14       Rear         8        5.0        170 4200         1350              No               23.0          6
15      Front         6        3.3        153 5300         1990              No               18.0          6
16      Front         4        3.0        141 5000         2090              No               16.0          6
17      Front         6        3.3        147 4800         1785              No               16.0          6
18      Front         4        1.5         92 6000         3285             Yes               13.2          5
19      Front         4        2.2         93 4800         2595             Yes               14.0          5
20      Front         4        2.5        100 4800         2535             Yes               16.0          6
21        4WD         6        3.0        142 5000         1970              No               20.0          7
22      Front         4        2.5        100 4800         2465              No               16.0          6
23      Front         4        1.5         92 6000         2505             Yes               13.2          5
24      Front         6        3.5        214 5800         1980              No               18.0          6
25      Front         4        1.3         63 5000         3150             Yes               10.0          4
26      Front         4        1.8        127 6500         2410             Yes               13.2          5
27      Front         4        2.3         96 4200         2805             Yes               15.9          5
28       Rear         4        2.3        105 4600         2285             Yes               15.4          4
29      Front         4        2.0        115 5500         2340             Yes               15.5          4
30        4WD         6        3.0        145 4800         2080             Yes               21.0          7
31      Front         6        3.0        140 4800         1885              No               16.0          5
32       Rear         8        4.6        190 4200         1415              No               20.0          6
33      Front         3        1.0         55 5700         3755             Yes               10.6          4
34      Front         4        1.6         90 5400         3250             Yes               12.4          4
35      Front         4        2.3        160 5800         2855             Yes               15.9          4
36      Front         4        1.5        102 5900         2650             Yes               11.9          4
37      Front         4        2.2        140 5600         2610             Yes               17.0          4
   Length Wheelbase Width Turn.circle Rear.seat.room Luggage.room Weight  Origin                 Make
1     177       102    68          37           26.5           11   2705 non-USA        Acura Integra
2     180       102    67          37           28.0           14   3375 non-USA              Audi 90
3     186       109    69          39           27.0           13   3640 non-USA             BMW 535i
4     189       105    69          41           28.0           16   2880     USA        Buick Century
5     200       111    74          42           30.5           17   3470     USA        Buick LeSabre
6     216       116    78          45           30.5           21   4105     USA     Buick Roadmaster
7     206       114    73          43           35.0           18   3620     USA     Cadillac DeVille
8     182       101    66          38           25.0           13   2490     USA   Chevrolet Cavalier
9     184       103    68          39           26.0           14   2785     USA    Chevrolet Corsica
10    193       101    74          43           25.0           13   3240     USA     Chevrolet Camaro
11    198       108    71          40           28.5           16   3195     USA     Chevrolet Lumina
12    178       110    74          44           30.5           NA   3715     USA Chevrolet Lumina_APV
13    194       111    78          42           33.5           NA   4025     USA      Chevrolet Astro
14    214       116    77          42           29.5           20   3910     USA    Chevrolet Caprice
15    203       113    74          40           31.0           15   3515     USA    Chrylser Concorde
16    183       104    68          41           30.5           14   3085     USA     Chrysler LeBaron
17    203       110    69          44           36.0           17   3570     USA    Chrysler Imperial
18    174        98    66          32           26.5           11   2270     USA           Dodge Colt
19    172        97    67          38           26.5           13   2670     USA         Dodge Shadow
20    181       104    68          39           30.5           14   2970     USA         Dodge Spirit
21    175       112    72          42           26.5           NA   3705     USA        Dodge Caravan
22    192       105    69          42           30.5           16   3080     USA        Dodge Dynasty
23    174        98    66          36           26.5           11   2295     USA         Eagle Summit
24    202       113    74          40           30.0           15   3490     USA         Eagle Vision
25    141        90    63          33           26.0           12   1845     USA         Ford Festiva
26    171        98    67          36           28.0           12   2530     USA          Ford Escort
27    177       100    68          39           27.5           13   2690     USA           Ford Tempo
28    180       101    68          40           24.0           12   2850     USA         Ford Mustang
29    179       103    70          38           23.0           18   2710     USA           Ford Probe
30    176       119    72          45           30.0           NA   3735     USA        Ford Aerostar
31    192       106    71          40           27.5           18   3325     USA          Ford Taurus
32    212       114    78          43           30.0           21   3950     USA  Ford Crown_Victoria
33    151        93    63          34           27.5           10   1695 non-USA            Geo Metro
34    164        97    67          37           24.5           11   2475 non-USA            Geo Storm
35    175       100    70          39           23.5            8   2865 non-USA        Honda Prelude
36    173       103    67          36           28.0           12   2350 non-USA          Honda Civic
37    185       107    67          41           28.0           14   3040 non-USA         Honda Accord
 [ reached 'max' / getOption("max.print") -- omitted 41 rows ]
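
To double-check the AND/OR behavior above, here is a minimal sketch. It assumes df is simply the Cars93 data from the MASS package (loaded with data() so that MASS itself is not attached), and the variable names and_comma, and_amp, or_rows and mixed are only for illustration.

```r
library(dplyr)

# Assumption: df is the unmodified Cars93 data set from MASS.
data(Cars93, package = "MASS")
df <- Cars93

# Conditions separated by commas are combined with AND,
# so these two calls should select the same cars.
and_comma <- filter(df, Type == "Large", Price <= 20, MPG.city >= 20)
and_amp   <- filter(df, Type == "Large" & Price <= 20 & MPG.city >= 20)
identical(and_comma$Model, and_amp$Model)

# With |, a row is kept if at least one condition is TRUE.
or_rows <- filter(df, Type == "Large" | Price <= 20 | MPG.city >= 20)
nrow(or_rows)

# AND and OR can be mixed: (Large or Van) AND (cheap OR fuel-efficient).
mixed <- filter(df, Type %in% c("Large", "Van"), Price <= 20 | MPG.city >= 20)
table(mixed$Type)
```

Comparing row counts instead of printing the whole result also avoids running into the max.print limit seen above.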


> slice(df,1:3) # extracts the rows at the given positions (here rows 1 to 3)
  Manufacturer   Model    Type Min.Price Price Max.Price MPG.city MPG.highway            AirBags DriveTrain
1        Acura Integra   Small      12.9  15.9      18.8       25          31               None      Front
2        Acura  Legend Midsize      29.2  33.9      38.7       18          25 Driver & Passenger      Front
3         Audi      90 Compact      25.9  29.1      32.3       20          26        Driver only      Front
  Cylinders EngineSize Horsepower  RPM Rev.per.mile Man.trans.avail Fuel.tank.capacity Passengers Length
1         4        1.8        140 6300         2890             Yes               13.2          5    177
2         6        3.2        200 5500         2335             Yes               18.0          5    195
3         6        2.8        172 5500         2280             Yes               16.9          5    180
  Wheelbase Width Turn.circle Rear.seat.room Luggage.room Weight  Origin          Make
1       102    68          37           26.5           11   2705 non-USA Acura Integra
2       115    71          38           30.0           15   3560 non-USA  Acura Legend
3       102    67          37           28.0           14   3375 non-USA       Audi 90
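
slice is not limited to a continuous range. The sketch below shows a few other ways of passing row positions, again assuming df is the unmodified Cars93 data from MASS; the result names are made up for illustration.

```r
library(dplyr)

# Assumption: df is the unmodified Cars93 data set from MASS.
data(Cars93, package = "MASS")
df <- Cars93

first_three   <- slice(df, 1:3)          # rows 1 to 3, as above
picked        <- slice(df, c(1, 5, 10))  # any vector of row positions works
last_row      <- slice(df, n())          # n() is the number of rows
without_first <- slice(df, -(1:3))       # negative positions drop rows

nrow(first_three)
nrow(without_first)
```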

 

 

2. Extracting specific columns (select)

 

Without dplyr, specific columns can be extracted with base R like this.

> df[,c("Model","Type","Price")]  ## base R only, no dplyr
            Model    Type Price
1         Integra   Small  15.9
2          Legend Midsize  33.9
3              90 Compact  29.1
4             100 Midsize  37.7
5            535i Midsize  30.0

 

With dplyr's select, however, extracting columns is a bit more convenient.

 

> select(df,1:5)  # columns 1 through 5
    Manufacturer          Model    Type Min.Price Price
1          Acura        Integra   Small      12.9  15.9
2          Acura         Legend Midsize      29.2  33.9
3           Audi             90 Compact      25.9  29.1
4           Audi            100 Midsize      30.8  37.7
5            BMW           535i Midsize      23.7  30.0

> select(df,Type,Price,Model) # the Type, Price and Model columns, in that order
      Type Price          Model
1    Small  15.9        Integra
2  Midsize  33.9         Legend
3  Compact  29.1             90
4  Midsize  37.7            100
5  Midsize  30.0           535i

> select(df,Model:Price) # every column from Model through Price
            Model    Type Min.Price Price
1         Integra   Small      12.9  15.9
2          Legend Midsize      29.2  33.9
3              90 Compact      25.9  29.1
4             100 Midsize      30.8  37.7
5            535i Midsize      23.7  30.0

> select(df,-(Model:Price)) # everything except the columns from Model through Price
   Manufacturer Max.Price MPG.city MPG.highway            AirBags DriveTrain Cylinders EngineSize Horsepower  RPM Rev.per.mile Man.trans.avail Fuel.tank.capacity Passengers
1         Acura      18.8       25          31               None      Front         4        1.8        140 6300         2890             Yes               13.2          5
2         Acura      38.7       18          25 Driver & Passenger      Front         6        3.2        200 5500         2335             Yes               18.0          5
3          Audi      32.3       20          26        Driver only      Front         6        2.8        172 5500         2280             Yes               16.9          5
4          Audi      44.6       19          26 Driver & Passenger      Front         6        2.8        172 5500         2535             Yes               21.1          6
5           BMW      36.2       22          30        Driver only       Rear         4        3.5        208 5700         2545             Yes               21.1          4

> select(df,starts_with("M")) # columns whose names start with "M"
    Manufacturer          Model Min.Price Max.Price MPG.city MPG.highway Man.trans.avail                     Make
1          Acura        Integra      12.9      18.8       25          31             Yes            Acura Integra
2          Acura         Legend      29.2      38.7       18          25             Yes             Acura Legend
3           Audi             90      25.9      32.3       20          26             Yes                  Audi 90
4           Audi            100      30.8      44.6       19          26             Yes                 Audi 100
5            BMW           535i      23.7      36.2       22          30             Yes                 BMW 535i

> select(df,ends_with("e")) # columns whose names end with "e"
      Type Min.Price Price Max.Price EngineSize Rev.per.mile Wheelbase Turn.circle                     Make
1    Small      12.9  15.9      18.8        1.8         2890       102          37            Acura Integra
2  Midsize      29.2  33.9      38.7        3.2         2335       115          38             Acura Legend
3  Compact      25.9  29.1      32.3        2.8         2280       102          37                  Audi 90
4  Midsize      30.8  37.7      44.6        2.8         2535       106          37                 Audi 100
5  Midsize      23.7  30.0      36.2        3.5         2545       109          39                 BMW 535i

> select(df,contains("x")) # columns whose names contain "x" (case-insensitive)
   Max.Price
1       18.8
2       38.7
3       32.3
4       44.6
5       36.2

> select(df,matches(".P.")) # regular expression: a p or P with at least one character on each side
      Type Min.Price Max.Price MPG.city MPG.highway Horsepower  RPM Rev.per.mile Fuel.tank.capacity
1    Small      12.9      18.8       25          31        140 6300         2890               13.2
2  Midsize      29.2      38.7       18          25        200 5500         2335               18.0
3  Compact      25.9      32.3       20          26        172 5500         2280               16.9
4  Midsize      30.8      44.6       19          26        172 5500         2535               21.1
5  Midsize      23.7      36.2       22          30        208 5700         2545               21.1

> select(df,matches("P")) # a p or P anywhere in the name is enough
      Type Min.Price Price Max.Price MPG.city MPG.highway Horsepower  RPM Rev.per.mile Fuel.tank.capacity Passengers
1    Small      12.9  15.9      18.8       25          31        140 6300         2890               13.2          5
2  Midsize      29.2  33.9      38.7       18          25        200 5500         2335               18.0          5
3  Compact      25.9  29.1      32.3       20          26        172 5500         2280               16.9          5
4  Midsize      30.8  37.7      44.6       19          26        172 5500         2535               21.1          6
5  Midsize      23.7  30.0      36.2       22          30        208 5700         2545               21.1          4
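
The two matches results above include columns such as Type and Horsepower because matches treats its argument as a regular expression and ignores case by default. Here is a small sketch of making the match case-sensitive or anchoring it, assuming df is the unmodified Cars93 data from MASS (the result names are illustrative only).

```r
library(dplyr)

# Assumption: df is the unmodified Cars93 data set from MASS.
data(Cars93, package = "MASS")
df <- Cars93

# Only an upper-case P counts when ignore.case = FALSE.
upper_p <- select(df, matches("P", ignore.case = FALSE))
names(upper_p)

# ^ anchors the pattern at the start of the column name.
starts_p <- select(df, matches("^P", ignore.case = FALSE))
names(starts_p)

# An unescaped dot matches any character; escape it to match a literal dot.
has_dot <- select(df, matches("\\."))
names(has_dot)
```

If a plain substring is all that is needed, contains is the simpler choice because it does not interpret regular-expression characters.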

> select(df,one_of(c("Price","Type","chicken")))  # keeps only the columns that exist in the data; there is no chicken column, so it is skipped
   Price    Type
1   15.9   Small
2   33.9 Midsize
3   29.1 Compact
4   37.7 Midsize
5   30.0 Midsize
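
As far as I know, recent dplyr versions (1.0.0 and later) document one_of as superseded by any_of and all_of. The sketch below assumes such a version is installed and that df is the unmodified Cars93 data from MASS; wanted, picked and strict_failed are illustrative names.

```r
library(dplyr)

# Assumption: df is the unmodified Cars93 data set from MASS,
# and the installed dplyr provides any_of() and all_of().
data(Cars93, package = "MASS")
df <- Cars93

wanted <- c("Price", "Type", "chicken")

# any_of() silently skips names that are not in the data.
picked <- select(df, any_of(wanted))
names(picked)

# all_of() is strict: a missing column is an error.
strict_failed <- tryCatch(
  {
    select(df, all_of(wanted))
    FALSE
  },
  error = function(e) TRUE
)
strict_failed
```

any_of is handy when a column may or may not be present, while all_of is the safer choice when a missing column should stop the script.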

 

One thing to watch out for: both MASS and dplyr have a select function. If MASS's select masks dplyr's, you get an error like the one below. In that case, detach the MASS package.

MASS select

> select(df,1:5)
Error in select(df, 1:5) : unused argument (1:5)

## use detach
detach("package:MASS", unload = TRUE)
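
If MASS needs to stay attached, another option is to name the package explicitly with the :: operator. This is a minimal sketch: it attaches MASS after dplyr on purpose to reproduce the masking, and first_five is just an illustrative name.

```r
library(dplyr)
library(MASS)  # attached after dplyr, so a bare select() now refers to MASS::select

# dplyr::select always refers to dplyr's function, whatever is attached.
first_five <- dplyr::select(Cars93, 1:5)
names(first_five)
```

The prefix is a bit more typing, but the script no longer depends on the order in which the packages were loaded.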

 

Next, let's create some arbitrary data and try out num_range and rename.

A1 <- 1:10
A2 <- 11:20
A3 <- 21:30
A4 <- 31:40
df_1 <- data.frame(A1,A2,A3,A4) # build the data frame directly from the four vectors
select(df_1,num_range("A",2:3)) # of columns A1 to A4, picks A2 and A3 (prefix "A" plus the numbers 2:3)

> ## renaming columns
> names(df_1)
[1] "A1" "A2" "A3" "A4"
> df_1 <- rename(df_1,
+        one=A1,
+        two=A2,
+        three=A3,
+        four=A4)
> names(df_1)
[1] "one"   "two"   "three" "four" 
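
Here is a compact sketch of num_range and rename on a small made-up data frame. The data frame scores and the new column names are only for illustration.

```r
library(dplyr)

# Hypothetical toy data with numbered column names.
scores <- data.frame(A1 = 1:3, A2 = 4:6, A3 = 7:9, A4 = 10:12)

# num_range("A", 2:3) builds the names "A2" and "A3" and selects them.
middle <- select(scores, num_range("A", 2:3))
names(middle)   # "A2" "A3"

# rename() takes new_name = old_name and keeps every other column.
renamed <- rename(scores, one = A1)
names(renamed)  # "one" "A2" "A3" "A4"

# select() can rename while it picks columns.
picked <- select(scores, first = A1, last = A4)
names(picked)   # "first" "last"
```

The direction new_name = old_name is easy to get backwards, so it is worth checking names() after renaming, as done above.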

 

3. Sorting in ascending and descending order (arrange)

 

> arrange(df, Price) # ascending order is the default
   Manufacturer    Model    Type Min.Price Price Max.Price MPG.city MPG.highway            AirBags DriveTrain Cylinders EngineSize Horsepower  RPM Rev.per.mile
1          Ford  Festiva   Small       6.9   7.4       7.9       31          33               None      Front         4        1.3         63 5000         3150
2       Hyundai    Excel   Small       6.8   8.0       9.2       29          33               None      Front         4        1.5         81 5500         2710
3         Mazda      323   Small       7.4   8.3       9.1       29          37               None      Front         4        1.6         82 5000         2370
4           Geo    Metro   Small       6.7   8.4      10.0       46          50               None      Front         3        1.0         55 5700         3755
5        Subaru    Justy   Small       7.3   8.4       9.5       33          37               None        4WD         3        1.2         73 5600         2875

> arrange(df, Price, desc(Max.Price)) # sorts by Price ascending first, then breaks ties by Max.Price descending
   Manufacturer    Model    Type Min.Price Price Max.Price MPG.city MPG.highway            AirBags DriveTrain Cylinders EngineSize Horsepower  RPM Rev.per.mile
1          Ford  Festiva   Small       6.9   7.4       7.9       31          33               None      Front         4        1.3         63 5000         3150
2       Hyundai    Excel   Small       6.8   8.0       9.2       29          33               None      Front         4        1.5         81 5500         2710
3         Mazda      323   Small       7.4   8.3       9.1       29          37               None      Front         4        1.6         82 5000         2370
4           Geo    Metro   Small       6.7   8.4      10.0       46          50               None      Front         3        1.0         55 5700         3755
5        Subaru    Justy   Small       7.3   8.4       9.5       33          37               None        4WD         3        1.2         73 5600         2875
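
The second sort key only matters for rows that tie on the first one, which is hard to see in the wide output above. A tiny made-up data frame makes it easier to follow; cars and its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: models "a" and "c" tie on price.
cars <- data.frame(
  model     = c("a", "b", "c", "d"),
  price     = c(8.4, 7.4, 8.4, 8.0),
  max_price = c(9.5, 7.9, 10.0, 9.2)
)

# price ascending; among equal prices, the larger max_price comes first.
sorted <- arrange(cars, price, desc(max_price))
sorted$model  # "b" "d" "c" "a"

# Wrapping the first key in desc() puts the most expensive cars on top.
expensive_first <- arrange(cars, desc(price), desc(max_price))
expensive_first$model  # "c" "a" "d" "b"
```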