Preparing the tourism data for plotting. I worked with Carson Klemmer.
I downloaded the international tourist arrivals by world region data from Our World in Data. I selected this data because I plan on traveling after I graduate from Sonoma State.
This is the link to the data.
The following code chunk loads the packages I will use to read in and prepare the data for analysis.
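The loading step can be sketched as follows. This is a minimal sketch, not the original chunk: it assumes the file was read with base R's read.csv(), which would explain the dots in International.Tourist.Arrivals, and it uses a temporary stand-in file whose header text and three rows are assumptions based on the glimpse() output.

```r
library(dplyr)  # provides glimpse(), rename(), filter(), select(), and %>%

# Stand-in for the downloaded file (hypothetical): a temporary CSV holding
# the first three rows visible in the glimpse() output.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Entity,Code,Year,International Tourist Arrivals",
  "Africa,,1950,500000",
  "Africa,,1960,800000",
  "Africa,,1965,1400000"
), csv_path)

# read.csv() passes column names through make.names(), so the spaces in the
# header become dots: International.Tourist.Arrivals.
tourist_arrivals_by_region <- read.csv(csv_path)
glimpse(tourist_arrivals_by_region)
```

With the real download, csv_path would point at the CSV saved from Our World in Data instead of the temporary file.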
glimpse(tourist_arrivals_by_region)
Rows: 205
Columns: 4
$ Entity <chr> "Africa", "Africa", "Africa",…
$ Code <lgl> NA, NA, NA, NA, NA, NA, NA, N…
$ Year <int> 1950, 1960, 1965, 1970, 1975,…
$ International.Tourist.Arrivals <int> 500000, 800000, 1400000, 2400…
#View(tourist_arrivals_by_region)
Create the regions vector, which lists the regions I want to extract from the data set.
Change the name of the first column to Region.
Use filter to extract the rows that I want to keep: Year >= 2000 and Region in regions.
Select the columns to keep: Region, Year, and International.Tourist.Arrivals.
Assign the output to regional_tourism.
Display regional_tourism.
# Regions to keep
regions <- c("Africa",
             "Middle East",
             "Asia & Pacific",
             "Americas",
             "Europe")
regional_tourism <- tourist_arrivals_by_region %>%
  rename(Region = 1) %>%                        # first column (Entity) becomes Region
  filter(Year >= 2000, Region %in% regions) %>% # keep 2000 onward for the chosen regions
  select(Region, Year, International.Tourist.Arrivals)
regional_tourism
Region Year International.Tourist.Arrivals
1 Africa 2000 27900000
2 Africa 2001 29100000
3 Africa 2002 30000000
4 Africa 2003 31600000
5 Africa 2004 34500000
6 Africa 2005 37300000
7 Africa 2006 41400000
8 Africa 2007 44300000
9 Africa 2008 44400000
10 Africa 2009 45900000
11 Africa 2010 50400000
12 Africa 2014 55200000
13 Africa 2015 53800000
14 Africa 2016 58200000
15 Africa 2017 63000000
16 Africa 2018 67000000
17 Americas 2000 128200000
18 Americas 2001 122100000
19 Americas 2002 116700000
20 Americas 2003 113100000
21 Americas 2004 125700000
22 Americas 2005 133500000
23 Americas 2006 135800000
24 Americas 2007 142500000
25 Americas 2008 147800000
26 Americas 2009 141700000
27 Americas 2010 150100000
28 Americas 2014 181900000
29 Americas 2015 192700000
30 Americas 2016 200900000
31 Americas 2017 207000000
32 Americas 2018 217000000
33 Asia & Pacific 2000 110600000
34 Asia & Pacific 2001 115700000
35 Asia & Pacific 2002 124900000
36 Asia & Pacific 2003 113300000
37 Asia & Pacific 2004 144200000
38 Asia & Pacific 2005 155400000
39 Asia & Pacific 2006 166800000
40 Asia & Pacific 2007 184200000
41 Asia & Pacific 2008 184100000
42 Asia & Pacific 2009 181100000
43 Asia & Pacific 2010 205500000
44 Asia & Pacific 2014 264400000
45 Asia & Pacific 2015 279300000
46 Asia & Pacific 2016 302900000
47 Asia & Pacific 2017 323000000
48 Asia & Pacific 2018 343000000
49 Europe 2000 391000000
50 Europe 2001 395200000
51 Europe 2002 407000000
52 Europe 2003 407100000
53 Europe 2004 424400000
54 Europe 2005 441500000
55 Europe 2006 462100000
56 Europe 2007 484900000
57 Europe 2008 485200000
58 Europe 2009 461700000
59 Europe 2010 489400000
60 Europe 2014 580200000
61 Europe 2015 607500000
62 Europe 2016 619700000
63 Europe 2017 671000000
64 Europe 2018 713000000
65 Middle East 2000 24400000
66 Middle East 2001 24500000
67 Middle East 2002 28500000
68 Middle East 2003 29500000
69 Middle East 2004 36300000
70 Middle East 2005 39000000
71 Middle East 2006 41400000
72 Middle East 2007 47400000
73 Middle East 2008 55200000
74 Middle East 2009 52800000
75 Middle East 2010 55400000
76 Middle East 2014 55400000
77 Middle East 2015 55900000
78 Middle East 2016 53600000
79 Middle East 2017 58000000
80 Middle East 2018 64000000
Check that the total for 2000 equals the total in the graph.
regional_tourism %>%
  filter(Year == 2000) %>%
  summarize(total_arrivals = sum(International.Tourist.Arrivals))
total_arrivals
1 682100000
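The same check can be made explicit so that it fails loudly if the numbers disagree. This is a minimal sketch: arrivals_2000 holds the five 2000 rows copied from the output above, and expected_total is a hypothetical name set to the printed total, to be replaced with the figure read from the graph.

```r
library(dplyr)

# The five rows for 2000, copied from the regional_tourism output.
arrivals_2000 <- data.frame(
  Region = c("Africa", "Americas", "Asia & Pacific", "Europe", "Middle East"),
  Year = 2000,
  International.Tourist.Arrivals = c(27900000, 128200000, 110600000,
                                     391000000, 24400000)
)

# Sum the arrivals and pull the single value out of the summary table.
total_2000 <- arrivals_2000 %>%
  summarize(total_arrivals = sum(International.Tourist.Arrivals)) %>%
  pull(total_arrivals)

# Assumption: replace this with the 2000 total shown in the graph.
expected_total <- 682100000

# Stops with an error if the two totals differ.
stopifnot(total_2000 == expected_total)
```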
Add a picture
Write the data to a file in the project directory.
write_csv(regional_tourism, file = "international-tourist-arrivals-by-world-region.csv")
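One way to confirm the file was written correctly is to read it back and compare it with the original. The sketch below assumes the readr package and uses a temporary file and two sample rows (sample_rows and out_path are hypothetical names) so that it runs on its own.

```r
library(readr)

# Two rows copied from the regional_tourism output, as a small stand-in.
sample_rows <- data.frame(
  Region = c("Africa", "Europe"),
  Year = c(2000, 2000),
  International.Tourist.Arrivals = c(27900000, 391000000)
)

# Write to a temporary file instead of the project directory.
out_path <- tempfile(fileext = ".csv")
write_csv(sample_rows, file = out_path)

# Read the file back and check that the shape and values survived.
round_trip <- read_csv(out_path, show_col_types = FALSE)
stopifnot(
  nrow(round_trip) == nrow(sample_rows),
  identical(names(round_trip), names(sample_rows)),
  isTRUE(all.equal(round_trip$International.Tourist.Arrivals,
                   sample_rows$International.Tourist.Arrivals))
)
```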