(1) Creating a .csv file in Excel
Read in the data
z <- read.table("Data.csv", header = TRUE, sep=",", stringsAsFactors=FALSE)
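For a plain comma-separated export, `read.csv()` is a wrapper around `read.table()` with `header = TRUE` and `sep = ","` already set. A minimal sketch, assuming a small made-up file written to a temporary path (the three columns and two rows are illustrative, not the real Data.csv):

```r
# Write a tiny, made-up CSV to a temporary file (not the real Data.csv).
tmp <- tempfile(fileext = ".csv")
writeLines(c("ID,Sample,Plot", "1,ITS10,5", "2,ITS100,10"), tmp)

# The explicit read.table() call and the read.csv() shorthand.
a <- read.table(tmp, header = TRUE, sep = ",", stringsAsFactors = FALSE)
b <- read.csv(tmp, stringsAsFactors = FALSE)

# Compare the two data frames.
same <- isTRUE(all.equal(a, b))
print(same)

# Clean up the temporary file.
unlink(tmp)
```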
Check the data for errors
str()
str(z)
## 'data.frame':    120 obs. of  10 variables:
##  $ ID              : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Sample          : chr  "ITS10" "ITS100" "ITS102" "ITS104" ...
##  $ Plot            : int  5 10 1 2 3 4 5 6 7 8 ...
##  $ WarmingTreatment: chr  "control" "warmed" "control" "warmed" ...
##  $ HostSpecies     : chr  "AL" "FT" "PP" "PP" ...
##  $ TissueType      : chr  "Lvs" "Lvs" "Lvs" "Lvs" ...
##  $ CollectionDate  : chr  "September" "June" "June" "June" ...
##  $ DivShann        : num  2.56 1.79 3.47 2.25 3.36 ...
##  $ Richness        : int  29 160 152 165 229 175 47 9 24 6 ...
##  $ Evenness        : num  0.76 0.354 0.691 0.44 0.619 ...
summary()
summary(z)
##        ID              Sample            Plot       WarmingTreatment
##  Min.   :  1.00   Length:120         Min.   : 1.0   Length:120
##  1st Qu.: 30.75   Class :character   1st Qu.: 3.0   Class :character
##  Median : 60.50   Mode  :character   Median : 5.5   Mode  :character
##  Mean   : 60.50                      Mean   : 5.5
##  3rd Qu.: 90.25                      3rd Qu.: 8.0
##  Max.   :120.00                      Max.   :10.0
##
##  HostSpecies        TissueType         CollectionDate        DivShann
##  Length:120         Length:120         Length:120         Min.   :0.2679
##  Class :character   Class :character   Class :character   1st Qu.:2.1592
##  Mode  :character   Mode  :character   Mode  :character   Median :2.7622
##                                                           Mean   :2.6448
##                                                           3rd Qu.:3.2020
##                                                           Max.   :4.4158
##
##     Richness        Evenness
##  Min.   :  6.0   Min.   :0.3091
##  1st Qu.: 36.5   1st Qu.:0.5719
##  Median : 97.5   Median :0.6664
##  Mean   :101.7   Mean   :0.6676
##  3rd Qu.:153.5   3rd Qu.:0.7673
##  Max.   :296.0   Max.   :0.9094
##  NA's   :60      NA's   :60
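The summary reports 60 NA's in both Richness and Evenness. One way to count and locate missing values is sketched below on a toy data frame; `toy` is a hypothetical stand-in for `z`, with four rows whose values are rounded from rows shown elsewhere in this output:

```r
# Toy stand-in for z: two complete rows and two rows missing Richness/Evenness.
toy <- data.frame(
  DivShann = c(2.56, 1.79, 3.09, 3.35),
  Richness = c(29L, 160L, NA, NA),
  Evenness = c(0.760, 0.354, NA, NA)
)

# Number of missing values in each column.
na_counts <- colSums(is.na(toy))
print(na_counts)

# Rows that have at least one missing value.
incomplete <- toy[!complete.cases(toy), ]
print(incomplete)
```

Run on the real data, `colSums(is.na(z))` would presumably show the same 60 and 60 that `summary()` reports.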
table()
table(z$Sample)
##
## ITS1 ITS10 ITS100 ITS101 ITS102 ITS103 ITS104 ITS105 ITS106 ITS107
## 1 1 1 1 1 1 1 1 1 1
## ITS108 ITS109 ITS11 ITS110 ITS111 ITS112 ITS113 ITS114 ITS115 ITS116
## 1 1 1 1 1 1 1 1 1 1
## ITS117 ITS118 ITS119 ITS12 ITS120 ITS13 ITS14 ITS15 ITS16 ITS17
## 1 1 1 1 1 1 1 1 1 1
## ITS18 ITS19 ITS2 ITS20 ITS21 ITS22 ITS23 ITS24 ITS25 ITS26
## 1 1 1 1 1 1 1 1 1 1
## ITS27 ITS28 ITS29 ITS3 ITS30 ITS31 ITS32 ITS33 ITS34 ITS35
## 1 1 1 1 1 1 1 1 1 1
## ITS36 ITS37 ITS38 ITS39 ITS4 ITS40 ITS41 ITS42 ITS43 ITS44
## 1 1 1 1 1 1 1 1 1 1
## ITS45 ITS46 ITS47 ITS48 ITS49 ITS5 ITS50 ITS51 ITS52 ITS53
## 1 1 1 1 1 1 1 1 1 1
## ITS54 ITS55 ITS56 ITS57 ITS58 ITS59 ITS6 ITS60 ITS61 ITS62
## 1 1 1 1 1 1 1 1 1 1
## ITS63 ITS64 ITS65 ITS66 ITS67 ITS68 ITS69 ITS7 ITS70 ITS71
## 1 1 1 1 1 1 1 1 1 1
## ITS72 ITS73 ITS74 ITS75 ITS76 ITS77 ITS78 ITS79 ITS8 ITS80
## 1 1 1 1 1 1 1 1 1 1
## ITS81 ITS82 ITS83 ITS84 ITS85 ITS86 ITS87 ITS88 ITS89 ITS9
## 1 1 1 1 1 1 1 1 1 1
## ITS90 ITS91 ITS92 ITS93 ITS94 ITS95 ITS96 ITS97 ITS98 ITS99
## 1 1 1 1 1 1 1 1 1 1
table(z$Plot)
##
## 1 2 3 4 5 6 7 8 9 10
## 12 12 12 12 12 12 12 12 12 12
table(z$WarmingTreatment)
##
## control warmed
## 60 60
table(z$HostSpecies)
##
## AL FT PP
## 40 40 40
table(z$TissueType)
##
## Lvs Rts
## 60 60
table(z$CollectionDate)
##
## June September
## 60 60
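The one-way tables above show each factor is balanced on its own, but they do not show whether the combinations are balanced. A two-way `table()` call checks that. The sketch below uses a hypothetical, fully crossed toy design built with `expand.grid()`; whether the real data are fully crossed is an assumption that would need checking on `z` itself:

```r
# Hypothetical fully crossed design (not the real data).
design <- expand.grid(
  WarmingTreatment = c("control", "warmed"),
  HostSpecies = c("AL", "FT", "PP"),
  TissueType = c("Lvs", "Rts"),
  stringsAsFactors = FALSE
)

# Cross-tabulate two factors; equal cell counts indicate a balanced combination.
tab <- table(design$WarmingTreatment, design$HostSpecies)
print(tab)

# TRUE when every cell of the cross-tabulation holds the same count.
balanced <- length(unique(as.vector(tab))) == 1
print(balanced)
```

The equivalent check on the real data would be `table(z$WarmingTreatment, z$HostSpecies)`.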
head()
head(z)
##   ID Sample Plot WarmingTreatment HostSpecies TissueType CollectionDate
## 1  1  ITS10    5          control          AL        Lvs      September
## 2  2 ITS100   10           warmed          FT        Lvs           June
## 3  3 ITS102    1          control          PP        Lvs           June
## 4  4 ITS104    2           warmed          PP        Lvs           June
## 5  5 ITS106    3          control          PP        Lvs           June
## 6  6 ITS108    4           warmed          PP        Lvs           June
##   DivShann Richness  Evenness
## 1 2.557716       29 0.7595756
## 2 1.794332      160 0.3535508
## 3 3.470708      152 0.6908420
## 4 2.246294      165 0.4399369
## 5 3.364005      229 0.6190977
## 6 3.172398      175 0.6142361
tail()
tail(z)
##      ID Sample Plot WarmingTreatment HostSpecies TissueType CollectionDate
## 115 115   ITS9    5          control          AL        Rts      September
## 116 116  ITS91    6           warmed          FT        Rts           June
## 117 117  ITS93    7          control          FT        Rts           June
## 118 118  ITS95    8           warmed          FT        Rts           June
## 119 119  ITS97    9          control          FT        Rts           June
## 120 120  ITS99   10           warmed          FT        Rts           June
##     DivShann Richness Evenness
## 115 3.092677       NA       NA
## 116 3.348968       NA       NA
## 117 2.033130       NA       NA
## 118 2.911970       NA       NA
## 119 2.710575       NA       NA
## 120 2.362769       NA       NA
(2) Working with Regular Expressions
Question 1
I took this:
First String    Second      1.22      3.4
Second          More Text   1.555555  2.2220
Third           x           3         124
And transformed it to this:
First String,Second,1.22,3.4
Second,More Text,1.555555,2.2220
Third,x,3,124
By searching for this:
\s{2,}
And replacing it with this:
,
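The same substitution can be sketched in R with `gsub()`. This assumes the columns were separated by runs of two or more spaces (the exact widths below are a guess). Backslashes are doubled because the pattern sits inside an R string:

```r
# Input lines; column gaps are assumed to be two or more spaces wide.
x <- c(
  "First String    Second      1.22      3.4",
  "Second          More Text   1.555555  2.2220",
  "Third           x           3         124"
)

# Replace every run of 2+ whitespace characters with a comma.
# Single spaces inside "First String" and "More Text" are left alone.
csv_lines <- gsub("\\s{2,}", ",", x, perl = TRUE)
print(csv_lines)
```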
Question 2
I took this:
Ballif, Bryan, University of Vermont
Ellison, Aaron, Harvard Forest
Record, Sydne, Bryn Mawr
And transformed it to this:
Bryan Ballif (University of Vermont)
Aaron Ellison (Harvard Forest)
Sydne Record (Bryn Mawr)
By searching for this:
(\w+), (\w+), (.+)
And replacing it with this:
\2 \1 (\3)
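As a check, the same capture-and-reorder can be run in R. This is a sketch using `gsub()`, with the three input lines copied from above:

```r
x <- c(
  "Ballif, Bryan, University of Vermont",
  "Ellison, Aaron, Harvard Forest",
  "Record, Sydne, Bryn Mawr"
)

# Group 1 = last name, group 2 = first name, group 3 = institution.
# The replacement swaps the names and wraps the institution in parentheses.
reordered <- gsub("(\\w+), (\\w+), (.+)", "\\2 \\1 (\\3)", x, perl = TRUE)
print(reordered)
```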
Question 3
Part 1
I took this:
0001 Georgia Horseshoe.mp3 0002 Billy In The Lowground.mp3 0003 Cherokee Shuffle.mp3 0004 Walking Cane.mp3
And transformed it to this:
0001 Georgia Horseshoe.mp3
0002 Billy In The Lowground.mp3
0003 Cherokee Shuffle.mp3
0004 Walking Cane.mp3
By searching for this:
\.mp3\s
And replacing it with this:
.mp3\n
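A sketch of this split in R. The pattern matches the extension together with the whitespace character that follows it, so no stray space is left at the end of one line or the start of the next:

```r
x <- "0001 Georgia Horseshoe.mp3 0002 Billy In The Lowground.mp3 0003 Cherokee Shuffle.mp3 0004 Walking Cane.mp3"

# Replace ".mp3" plus the following whitespace with ".mp3" plus a newline.
one_per_line <- gsub("\\.mp3\\s", ".mp3\n", x, perl = TRUE)
writeLines(one_per_line)

# Split on the inserted newlines to inspect the individual file names.
tracks <- strsplit(one_per_line, "\n", fixed = TRUE)[[1]]
```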
Part 2
I took this:
0001 Georgia Horseshoe.mp3
0002 Billy In The Lowground.mp3
0003 Cherokee Shuffle.mp3
0004 Walking Cane.mp3
And transformed it to this:
Georgia Horseshoe_0001.mp3
Billy In The Lowground_0002.mp3
Cherokee Shuffle_0003.mp3
Walking Cane_0004.mp3
By searching for this:
(\d+) (.+)\.mp3
And replacing it with this:
\2_\1.mp3
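The rename can be sketched in R as follows. The extension is matched literally as `\.mp3` so that only a real ".mp3" suffix is accepted:

```r
x <- c(
  "0001 Georgia Horseshoe.mp3",
  "0002 Billy In The Lowground.mp3",
  "0003 Cherokee Shuffle.mp3",
  "0004 Walking Cane.mp3"
)

# Group 1 = track number, group 2 = title; the extension is matched literally.
renamed <- gsub("(\\d+) (.+)\\.mp3", "\\2_\\1.mp3", x, perl = TRUE)
print(renamed)
```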
Question 4
Part 1
I took this:
Camponotus,pennsylvanicus,10.2,44
Camponotus,herculeanus,10.5,3
Myrmica,punctiventris,12.2,4
Lasius,neoniger,3.3,55
And transformed it to this:
C_pennsylvanicus,44
C_herculeanus,3
M_punctiventris,4
L_neoniger,55
By searching for this:
(\w)(\w+),(\w+),(.+),(.+)
And replacing it with this:
\1_\3,\5
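Run in R, the pattern behaves as sketched below. The first `(\w)` captures only the genus initial, and the two trailing `(.+)` groups split at the last comma because the regex engine backtracks until the final `,(.+)` can match:

```r
ants <- c(
  "Camponotus,pennsylvanicus,10.2,44",
  "Camponotus,herculeanus,10.5,3",
  "Myrmica,punctiventris,12.2,4",
  "Lasius,neoniger,3.3,55"
)

# Keep group 1 (genus initial), group 3 (species), and group 5 (last field).
short <- gsub("(\\w)(\\w+),(\\w+),(.+),(.+)", "\\1_\\3,\\5", ants, perl = TRUE)
print(short)
```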
Part 2
I took this:
Camponotus,pennsylvanicus,10.2,44
Camponotus,herculeanus,10.5,3
Myrmica,punctiventris,12.2,4
Lasius,neoniger,3.3,55
And transformed it to this:
C_penn,44
C_herc,3
M_punc,4
L_neon,55
By searching for this:
(\w)(\w+),(\w{4})(\w+),(.+),(.+)
And replacing it with this:
\1_\3,\6
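The only change from Part 1 is that the species name is split into `(\w{4})` and `(\w+)`, which shifts the last field from group 5 to group 6. A sketch in R:

```r
ants <- c(
  "Camponotus,pennsylvanicus,10.2,44",
  "Camponotus,herculeanus,10.5,3",
  "Myrmica,punctiventris,12.2,4",
  "Lasius,neoniger,3.3,55"
)

# Group 3 holds the first four letters of the species; group 6 is the last field.
abbrev <- gsub("(\\w)(\\w+),(\\w{4})(\\w+),(.+),(.+)", "\\1_\\3,\\6", ants, perl = TRUE)
print(abbrev)
```

Because `(\w{4})(\w+)` needs at least five letters, a species name of four letters or fewer would not match and the line would be left unchanged.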