-
Notifications
You must be signed in to change notification settings - Fork 16
/
ggplot2-Tutorial-With-R.html
444 lines (409 loc) · 45.5 KB
/
ggplot2-Tutorial-With-R.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
<!DOCTYPE html>
<html>
<head>
<title>How to make any plot in ggplot2? | ggplot2 Tutorial</title>
<meta charset="utf-8">
<meta name="Description" content="R Language Tutorials for Advanced Statistics">
<meta name="Keywords" content="R, Tutorial, Machine learning, Statistics, Data Mining, Analytics, Data science, Linear Regression, Logistic Regression, Time series, Forecasting">
<meta name="Distribution" content="Global">
<meta name="Author" content="Selva Prabhakaran">
<meta name="Robots" content="index, follow">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="shortcut icon" href="/screenshots/iconb-64.png" type="image/x-icon" />
<link href="www/bootstrap.min.css" rel="stylesheet">
<link href="www/highlight.css" rel="stylesheet">
<link href='http://fonts.googleapis.com/css?family=Inconsolata:400,700'
rel='stylesheet' type='text/css'>
<!-- Color Script -->
<style type="text/css">
a {
color: #3675C5;
color: rgb(25, 145, 248);
color: #4582ec;
color: #3F73D8;
}
li {
line-height: 1.65;
}
/* reduce spacing around math formula*/
.MathJax_Display {
margin: 0em 0em;
}
</style>
<!-- Add Google search -->
<script language="Javascript" type="text/javascript">
function my_search_google()
{
var query = document.getElementById("my-google-search").value;
window.open("http://google.com/search?q=" + query
+ "%20site:" + "http://r-statistics.co");
}
</script>
</head>
<body>
<div class="container">
<div class="masthead">
<!--
<ul class="nav nav-pills pull-right">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">
Table of contents<b class="caret"></b>
</a>
<ul class="dropdown-menu pull-right" role="menu">
<li class="dropdown-header"></li>
<li class="dropdown-header">Tutorial</li>
<li><a href="R-Tutorial.html">R Tutorial</a></li>
<li class="dropdown-header">ggplot2</li>
<li><a href="ggplot2-Tutorial-With-R.html">ggplot2 Short Tutorial</a></li>
<li><a href="Complete-Ggplot2-Tutorial-Part1-With-R-Code.html">ggplot2 Tutorial 1 - Intro</a></li>
<li><a href="Complete-Ggplot2-Tutorial-Part2-Customizing-Theme-With-R-Code.html">ggplot2 Tutorial 2 - Theme</a></li>
<li><a href="Top50-Ggplot2-Visualizations-MasterList-R-Code.html">ggplot2 Tutorial 3 - Masterlist</a></li>
<li><a href="ggplot2-cheatsheet.html">ggplot2 Quickref</a></li>
<li class="dropdown-header">Foundations</li>
<li><a href="Linear-Regression.html">Linear Regression</a></li>
<li><a href="Statistical-Tests-in-R.html">Statistical Tests</a></li>
<li><a href="Missing-Value-Treatment-With-R.html">Missing Value Treatment</a></li>
<li><a href="Outlier-Treatment-With-R.html">Outlier Analysis</a></li>
<li><a href="Variable-Selection-and-Importance-With-R.html">Feature Selection</a></li>
<li><a href="Model-Selection-in-R.html">Model Selection</a></li>
<li><a href="Logistic-Regression-With-R.html">Logistic Regression</a></li>
<li><a href="Environments.html">Advanced Linear Regression</a></li>
<li class="dropdown-header">Advanced Regression Models</li>
<li><a href="adv-regression-models.html">Advanced Regression Models</a></li>
<li class="dropdown-header">Time Series</li>
<li><a href="Time-Series-Analysis-With-R.html">Time Series Analysis</a></li>
<li><a href="Time-Series-Forecasting-With-R.html">Time Series Forecasting </a></li>
<li><a href="Time-Series-Forecasting-With-R-part2.html">More Time Series Forecasting</a></li>
<li class="dropdown-header">High Performance Computing</li>
<li><a href="Parallel-Computing-With-R.html">Parallel computing</a></li>
<li><a href="Strategies-To-Improve-And-Speedup-R-Code.html">Strategies to Speedup R code</a></li>
<li class="dropdown-header">Useful Techniques</li>
<li><a href="Association-Mining-With-R.html">Association Mining</a></li>
<li><a href="Multi-Dimensional-Scaling-With-R.html">Multi Dimensional Scaling</a></li>
<li><a href="Profiling.html">Optimization</a></li>
<li><a href="Information-Value-With-R.html">InformationValue package</a></li>
</ul>
</li>
</ul>
-->
<ul class="nav nav-pills pull-right">
<div class="input-group">
<form onsubmit="my_search_google()">
<input type="text" class="form-control" id="my-google-search" placeholder="Search..">
<form>
</div><!-- /input-group -->
</ul><!-- /.col-lg-6 -->
<h3 class="muted"><a href="/">r-statistics.co</a><small> by Selva Prabhakaran</small></h3>
<hr>
</div>
<div class="row">
<div class="col-xs-12 col-sm-3" id="nav">
<div class="well">
<li>
<ul class="list-unstyled">
<li class="dropdown-header"></li>
<li class="dropdown-header">Tutorial</li>
<li><a href="R-Tutorial.html">R Tutorial</a></li>
<li class="dropdown-header">ggplot2</li>
<li><a href="ggplot2-Tutorial-With-R.html">ggplot2 Short Tutorial</a></li>
<li><a href="Complete-Ggplot2-Tutorial-Part1-With-R-Code.html">ggplot2 Tutorial 1 - Intro</a></li>
<li><a href="Complete-Ggplot2-Tutorial-Part2-Customizing-Theme-With-R-Code.html">ggplot2 Tutorial 2 - Theme</a></li>
<li><a href="Top50-Ggplot2-Visualizations-MasterList-R-Code.html">ggplot2 Tutorial 3 - Masterlist</a></li>
<li><a href="ggplot2-cheatsheet.html">ggplot2 Quickref</a></li>
<li class="dropdown-header">Foundations</li>
<li><a href="Linear-Regression.html">Linear Regression</a></li>
<li><a href="Statistical-Tests-in-R.html">Statistical Tests</a></li>
<li><a href="Missing-Value-Treatment-With-R.html">Missing Value Treatment</a></li>
<li><a href="Outlier-Treatment-With-R.html">Outlier Analysis</a></li>
<li><a href="Variable-Selection-and-Importance-With-R.html">Feature Selection</a></li>
<li><a href="Model-Selection-in-R.html">Model Selection</a></li>
<li><a href="Logistic-Regression-With-R.html">Logistic Regression</a></li>
<li><a href="Environments.html">Advanced Linear Regression</a></li>
<li class="dropdown-header">Advanced Regression Models</li>
<li><a href="adv-regression-models.html">Advanced Regression Models</a></li>
<li class="dropdown-header">Time Series</li>
<li><a href="Time-Series-Analysis-With-R.html">Time Series Analysis</a></li>
<li><a href="Time-Series-Forecasting-With-R.html">Time Series Forecasting </a></li>
<li><a href="Time-Series-Forecasting-With-R-part2.html">More Time Series Forecasting</a></li>
<li class="dropdown-header">High Performance Computing</li>
<li><a href="Parallel-Computing-With-R.html">Parallel computing</a></li>
<li><a href="Strategies-To-Improve-And-Speedup-R-Code.html">Strategies to Speedup R code</a></li>
<li class="dropdown-header">Useful Techniques</li>
<li><a href="Association-Mining-With-R.html">Association Mining</a></li>
<li><a href="Multi-Dimensional-Scaling-With-R.html">Multi Dimensional Scaling</a></li>
<li><a href="Profiling.html">Optimization</a></li>
<li><a href="Information-Value-With-R.html">InformationValue package</a></li>
</ul>
</li>
</div>
<div class="well">
<p>Stay up-to-date. <a href="https://docs.google.com/forms/d/1xkMYkLNFU9U39Dd8S_2JC0p8B5t6_Yq6zUQjanQQJpY/viewform">Subscribe!</a></p>
<p><a href="https://docs.google.com/forms/d/13GrkCFcNa-TOIllQghsz2SIEbc-YqY9eJX02B19l5Ow/viewform">Chat!</a></p>
</div>
<h4>Contents</h4>
<ul class="list-unstyled" id="toc"></ul>
<!--
<hr>
<p><a href="/contribute.html">How to contribute</a></p>
<p><a class="btn btn-primary" href="">Edit this page</a></p>
-->
</div>
<div id="content" class="col-xs-12 col-sm-8 pull-right">
<h1>How to make any plot in ggplot2?</h1>
<blockquote>
<p>ggplot2 is the most elegant and aesthetically pleasing graphics framework available in R. It has a nicely planned structure to it. This tutorial focusses on exposing this underlying structure you can use to make any ggplot. But, the way you make plots in ggplot2 is very different from base graphics making the learning curve steep. So leave what you know about base graphics behind and follow along. You are just 5 steps away from cracking the ggplot puzzle.</p>
</blockquote>
<h2>Topics</h2>
<ol style="list-style-type: decimal">
<li><a href="ggplot2-Tutorial-With-R.html#1.%20The%20Setup">The Setup</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#2.%20The%20Layers">The Layers</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#3.%20The%20Labels">The Labels</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#4.%20The%20Theme">The Theme</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#5.%20The%20Facets">The Facets</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.%20Commonly%20Used%20Features">Commonly Used Features</a></li>
</ol>
<h5>Quick Ref</h5>
<ul>
<li><a href="ggplot2-Tutorial-With-R.html#6.1%20Make%20a%20time%20series%20plot%20(using%20ggfortify)">Make a time series plot (using ggfortify)</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.2%20Plot%20multiple%20timeseries%20on%20same%20ggplot">Plot multiple timeseries</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.3%20Bar%20charts">Make bar charts</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.4%20Custom%20layout">Custom layout</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.5%20Flipping%20coordinates">Flipping coordinates</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.6%20Adjust%20X%20and%20Y%20axis%20limits">Adjust X and Y axis limits</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.7%20Equal%20coordinates">Equal coordinates</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.8%20Change%20themes">Change themes</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.9%20Legend%20-%20Deleting%20and%20Changing%20Position">Legend</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.10%20Grid%20lines">Grid lines and panel background</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.11%20Plot%20margin%20and%20background">Plot margin and background</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.12%20Annotation">Annotation</a></li>
<li><a href="ggplot2-Tutorial-With-R.html#6.13%20Saving%20ggplot">Saving ggplot</a></li>
</ul>
<p><strong>Cheatsheets:</strong> Lookup code to accomplish common tasks from this <a href="ggplot2-cheatsheet.html">ggplot2 quickref</a> and <a href="http://www.rstudio.com/wp-content/uploads/2015/12/ggplot2-cheatsheet-2.0.pdf">this cheatsheet</a></p>
<p>The distinctive feature of the ggplot2 framework is the way you make plots through adding ‘layers’. The process of making any ggplot is as follows.</p>
<h2>1. The Setup</h2>
<p>First, you need to tell ggplot what dataset to use. This is done using the <code>ggplot(df)</code> function, where <code>df</code> is a dataframe that contains all features needed to make the plot. This is the most basic step. Unlike base graphics, ggplot doesn’t take vectors as arguments.</p>
<p>Optionally you can add whatever aesthetics you want to apply to your ggplot (inside <code>aes()</code> argument) - such as X and Y axis by specifying the respective variables from the dataset. The variable based on which the color, size, shape and stroke should change can also be specified here itself. The aesthetics specified here will be inherited by all the geom layers you will add subsequently.</p>
<p>If you intend to add more layers later on, may be a bar chart on top of a line graph, you can specify the respective aesthetics when you add those layers.</p>
<p>Below, I show few examples of how to setup ggplot using in the <code>diamonds</code> dataset that comes with <code>ggplot2</code> itself. However, no plot will be printed until you add the geom layers.</p>
<h5>Examples:</h5>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(ggplot2)
<span class="kw">ggplot</span>(diamonds) <span class="co"># if only the dataset is known.</span>
<span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat)) <span class="co"># if only X-axis is known. The Y-axis can be specified in respective geoms.</span>
<span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price)) <span class="co"># if both X and Y axes are fixed for all layers.</span>
<span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">color=</span>cut)) <span class="co"># Each category of the 'cut' variable will now have a distinct color, once a geom is added.</span></code></pre></div>
<p>The <code>aes</code> argument stands for aesthetics. ggplot2 considers the X and Y axis of the plot to be aesthetics as well, along with color, size, shape, fill etc. If you want to have the color, size etc fixed (i.e. not vary based on a variable from the dataframe), you need to specify it outside the <code>aes()</code>, like this.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat), <span class="dt">color=</span><span class="st">"steelblue"</span>)</code></pre></div>
<p>See this <a href="http://research.stowers-institute.org/efg/R/Color/Chart/">color palette</a> for more colors.</p>
<h2>2. The Layers</h2>
<p>The layers in ggplot2 are also called ‘<em>geoms</em>’. Once the base setup is done, you can append the geoms one on top of the other. The <a href="http://docs.ggplot2.org/current/">documentation</a> provides a compehensive list of all available <em>geoms</em>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_point</span>() +<span class="st"> </span><span class="kw">geom_smooth</span>() <span class="co"># Adding scatterplot geom (layer1) and smoothing geom (layer2).</span></code></pre></div>
<p>We have added two layers (geoms) to this plot - the <code>geom_point()</code> and <code>geom_smooth()</code>. Since the X axis Y axis and the color were defined in <code>ggplot()</code> setup itself, these two layers inherited those aesthetics. Alternatively, you can specify those aesthetics inside the geom layer also as shown below.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">ggplot</span>(diamonds) +<span class="st"> </span><span class="kw">geom_point</span>(<span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_smooth</span>(<span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) <span class="co"># Same as above but specifying the aesthetics inside the geoms.</span></code></pre></div>
<p><img src='screenshots/ggplot_1.png' width='698' height='349' /></p>
<p>Notice the X and Y axis and how the color of the points vary based on the value of <code>cut</code> variable. The legend was automatically added. I would like to propose a change though. Instead of having multiple smoothing lines for each level of <code>cut</code>, I want to integrate them all under one line. How to do that? Removing the <code>color</code> aesthetic from <code>geom_smooth()</code> layer would accomplish that.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(ggplot2)
<span class="kw">ggplot</span>(diamonds) +<span class="st"> </span><span class="kw">geom_point</span>(<span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_smooth</span>(<span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price)) <span class="co"># Remove color from geom_smooth</span>
<span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price)) +<span class="st"> </span><span class="kw">geom_point</span>(<span class="kw">aes</span>(<span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_smooth</span>() <span class="co"># same but simpler</span></code></pre></div>
<p><img src='screenshots/ggplot_1_single_smooth.png' width='698' height='349' /></p>
<p>Here is a quick challenge for you. Can you make the shape of the points vary with <code>color</code> feature?</p>
<p>Though setting up took us quite a bit of code, adding further complexity such as the layers, distinct color for each cut etc was easy. Imagine how much code you would have had to write if you were to make this in base graphics? Thanks to ggplot2!</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="co"># Answer to the challenge.</span>
<span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut, <span class="dt">shape=</span>color)) +<span class="st"> </span><span class="kw">geom_point</span>()</code></pre></div>
<h2>3. The Labels</h2>
<p>Now that you have drawn the main parts of the graph. You might want to add the plot’s main title and perhaps change the X and Y axis titles. This can be accomplished using the <code>labs</code> layer, meant for specifying the labels. However, manipulating the size, color of the labels is the job of the <a href="http://r-statistics.co/ggplot2-Tutorial-With-R.html#4.%20The%20Theme">‘Theme’</a>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(ggplot2)
gg <-<span class="st"> </span><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_point</span>() +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"Scatterplot"</span>, <span class="dt">x=</span><span class="st">"Carat"</span>, <span class="dt">y=</span><span class="st">"Price"</span>) <span class="co"># add axis lables and plot title.</span>
<span class="kw">print</span>(gg)</code></pre></div>
<p><img src='screenshots/ggplot_2.png' width='698' height='349' /></p>
<p>The plot’s main title is added and the X and Y axis labels capitalized.</p>
<p><strong>Note</strong>: If you are showing a ggplot inside a function, you need to explicitly save it and then print using the <code>print(gg)</code>, like we just did above.</p>
<h2>4. The Theme</h2>
<p>Almost everything is set, except that we want to increase the size of the labels and change the legend title. Adjusting the size of labels can be done using the <code>theme()</code> function by setting the <code>plot.title</code>, <code>axis.text.x</code> and <code>axis.text.y</code>. They need to be specified inside the <code>element_text()</code>. If you want to remove any of them, set it to <code>element_blank()</code> and it will vanish entirely.</p>
<p>Adjusting the legend title is a bit tricky. If your legend is that of a <code>color</code> attribute and it varies based in a factor, you need to set the <code>name</code> using <code>scale_color_discrete()</code>, where the <em>color</em> part belongs to the color attribute and the <em>discrete</em> because the legend is based on a factor variable.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">gg1 <-<span class="st"> </span>gg +<span class="st"> </span><span class="kw">theme</span>(<span class="dt">plot.title=</span><span class="kw">element_text</span>(<span class="dt">size=</span><span class="dv">30</span>, <span class="dt">face=</span><span class="st">"bold"</span>),
<span class="dt">axis.text.x=</span><span class="kw">element_text</span>(<span class="dt">size=</span><span class="dv">15</span>),
<span class="dt">axis.text.y=</span><span class="kw">element_text</span>(<span class="dt">size=</span><span class="dv">15</span>),
<span class="dt">axis.title.x=</span><span class="kw">element_text</span>(<span class="dt">size=</span><span class="dv">25</span>),
<span class="dt">axis.title.y=</span><span class="kw">element_text</span>(<span class="dt">size=</span><span class="dv">25</span>)) +<span class="st"> </span>
<span class="st"> </span><span class="kw">scale_color_discrete</span>(<span class="dt">name=</span><span class="st">"Cut of diamonds"</span>) <span class="co"># add title and axis text, change legend title.</span>
<span class="kw">print</span>(gg1) <span class="co"># print the plot</span></code></pre></div>
<p><img src='screenshots/ggplot_3.png' width='698' height='349' /></p>
<p>If the legend shows a shape attribute based on a factor variable, you need to change it using <code>scale_shape_discrete(name="legend title")</code>. Had it been a continuous variable, use <code>scale_shape_continuous(name="legend title")</code> instead.</p>
<p>So now, Can you guess the function to use if your legend is based on a <code>fill</code> attribute on a continuous variable?</p>
<p>The answer is <code>scale_fill_continuous(name="legend title")</code>.</p>
<h2>5. The Facets</h2>
<p>In the previous chart, you had the scatterplot for all different values of <code>cut</code> plotted in the same chart. What if you want one chart for one <code>cut</code>?</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">gg1 +<span class="st"> </span><span class="kw">facet_wrap</span>( ~<span class="st"> </span>cut, <span class="dt">ncol=</span><span class="dv">3</span>) <span class="co"># columns defined by 'cut'</span></code></pre></div>
<p><img src='screenshots/ggplot_4.png' width='698' height='349' /></p>
<p><code>facet_wrap(formula)</code> takes in a formula as the argument. The item on the RHS corresponds to the column. The item on the LHS defines the rows.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">gg1 +<span class="st"> </span><span class="kw">facet_wrap</span>(color ~<span class="st"> </span>cut) <span class="co"># row: color, column: cut</span></code></pre></div>
<p><img src='screenshots/ggplot_5.png' width='698' height='349' /></p>
<p>In <code>facet_wrap</code>, the scales of the X and Y axis are fixed to accomodate all points by default. This would make comparison of attributes meaningful because they would be in the same scale. However, it is possible to make the scales roam free making the charts look more evenly distributed by setting the argument <code>scales=free</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">gg1 +<span class="st"> </span><span class="kw">facet_wrap</span>(color ~<span class="st"> </span>cut, <span class="dt">scales=</span><span class="st">"free"</span>) <span class="co"># row: color, column: cut</span></code></pre></div>
<p>For comparison purposes, you can put all the plots in a grid as well using <code>facet_grid(formula)</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">gg1 +<span class="st"> </span><span class="kw">facet_grid</span>(color ~<span class="st"> </span>cut) <span class="co"># In a grid</span></code></pre></div>
<p><img src='screenshots/ggplot_6.png' width='698' height='349' /></p>
<p>Note, the headers for individual plots are gone leaving more space for plotting area..</p>
<h2>6. Commonly Used Features</h2>
<h4>6.1 Make a time series plot (using ggfortify)</h4>
<p>The <code>ggfortify</code> package makes it very easy to plot time series directly from a time series object, without having to convert it to a dataframe. The example below plots the <code>AirPassengers</code> timeseries in one step. Cool!. See more <a href="http://rpubs.com/sinhrks/plot_ts">ggfortify’s autoplot options to plot time series here</a>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(ggfortify)
<span class="kw">autoplot</span>(AirPassengers) +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"AirPassengers"</span>) <span class="co"># where AirPassengers is a 'ts' object</span></code></pre></div>
<p><img src='screenshots/ggplot_autoplot.png' width='698' height='349' /></p>
<h4>6.2 Plot multiple timeseries on same ggplot</h4>
<p>Plotting multiple timeseries requires that you have your data in dataframe format, in which one of the columns is the dates that will be used for X-axis.</p>
<p>Approach 1: After converting, you just need to keep adding multiple layers of time series one on top of the other.</p>
<p>Approach 2: Melt the dataframe using <code>reshape2::melt</code> by setting the <code>id</code> to the date field. Then just add one <code>geom_line</code> and set the color aesthetic to <code>variable</code> (which was created during the melt).</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="co"># Approach 1:</span>
<span class="kw">data</span>(economics, <span class="dt">package=</span><span class="st">"ggplot2"</span>) <span class="co"># init data</span>
economics <-<span class="st"> </span><span class="kw">data.frame</span>(economics) <span class="co"># convert to dataframe</span>
<span class="kw">ggplot</span>(economics) +<span class="st"> </span><span class="kw">geom_line</span>(<span class="kw">aes</span>(<span class="dt">x=</span>date, <span class="dt">y=</span>pce, <span class="dt">color=</span><span class="st">"pcs"</span>)) +<span class="st"> </span><span class="kw">geom_line</span>(<span class="kw">aes</span>(<span class="dt">x=</span>date, <span class="dt">y=</span>unemploy, <span class="dt">col=</span><span class="st">"unemploy"</span>)) +<span class="st"> </span><span class="kw">scale_color_discrete</span>(<span class="dt">name=</span><span class="st">"Legend"</span>) +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"Economics"</span>) <span class="co"># plot multiple time series using 'geom_line's</span></code></pre></div>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="co"># Approach 2:</span>
<span class="kw">library</span>(reshape2)
df <-<span class="st"> </span><span class="kw">melt</span>(economics[, <span class="kw">c</span>(<span class="st">"date"</span>, <span class="st">"pce"</span>, <span class="st">"unemploy"</span>)], <span class="dt">id=</span><span class="st">"date"</span>)
<span class="kw">ggplot</span>(df) +<span class="st"> </span><span class="kw">geom_line</span>(<span class="kw">aes</span>(<span class="dt">x=</span>date, <span class="dt">y=</span>value, <span class="dt">color=</span>variable)) +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"Economics"</span>)<span class="co"># plot multiple time series by melting</span></code></pre></div>
<p><img src='screenshots/ggplot_multiple_ts1.png' width='698' height='349' /></p>
<p>The disadvantage with ggplot2 is that it is not possible to get multiple Y-axis on the same plot. To plot multiple time series on the same scale can make few of the series appear small. An alternative would be to <code>facet_wrap</code> it and set the <code>scales='free'</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">df <-<span class="st"> </span><span class="kw">melt</span>(economics[, <span class="kw">c</span>(<span class="st">"date"</span>, <span class="st">"pce"</span>, <span class="st">"unemploy"</span>, <span class="st">"psavert"</span>)], <span class="dt">id=</span><span class="st">"date"</span>)
<span class="kw">ggplot</span>(df) +<span class="st"> </span><span class="kw">geom_line</span>(<span class="kw">aes</span>(<span class="dt">x=</span>date, <span class="dt">y=</span>value, <span class="dt">color=</span>variable)) +<span class="st"> </span><span class="kw">facet_wrap</span>( ~<span class="st"> </span>variable, <span class="dt">scales=</span><span class="st">"free"</span>)</code></pre></div>
<p><img src='screenshots/ggplot_multiplets_2.png' width='698' height='262' /></p>
<h4>6.3 Bar charts</h4>
<p>By default, ggplot makes a ‘counts’ barchart, meaning, it counts the frequency of items specified by the <code>x</code> aesthetic and plots it. In this format, you don’t need to specify the Y aesthetic. However, if you would like the make a bar chart of the absolute number, given by Y aesthetic, you need to set <code>stat="identity"</code> inside the <code>geom_bar</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">plot1 <-<span class="st"> </span><span class="kw">ggplot</span>(mtcars, <span class="kw">aes</span>(<span class="dt">x=</span>cyl)) +<span class="st"> </span><span class="kw">geom_bar</span>() +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"Frequency bar chart"</span>) <span class="co"># Y axis derived from counts of X item</span>
<span class="kw">print</span>(plot1)</code></pre></div>
<p><img src='screenshots/ggplot_bar1.png' width='698' height='349' /></p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">df <-<span class="st"> </span><span class="kw">data.frame</span>(<span class="dt">var=</span><span class="kw">c</span>(<span class="st">"a"</span>, <span class="st">"b"</span>, <span class="st">"c"</span>), <span class="dt">nums=</span><span class="kw">c</span>(<span class="dv">1</span>:<span class="dv">3</span>))
plot2 <-<span class="st"> </span><span class="kw">ggplot</span>(df, <span class="kw">aes</span>(<span class="dt">x=</span>var, <span class="dt">y=</span>nums)) +<span class="st"> </span><span class="kw">geom_bar</span>(<span class="dt">stat =</span> <span class="st">"identity"</span>) <span class="co"># Y axis is explicit. 'stat=identity'</span>
<span class="kw">print</span>(plot2)</code></pre></div>
<p><img src='screenshots/ggplot_bar2.png' width='698' height='349' /></p>
<h4>6.4 Custom layout</h4>
<p>The <code>gridExtra</code> package provides the facility to arrage multiple ggplots in a single grid.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(gridExtra)
<span class="kw">grid.arrange</span>(plot1, plot2, <span class="dt">ncol=</span><span class="dv">2</span>)</code></pre></div>
<p><img src='screenshots/ggplot_arrange.png' width='698' height='349' /></p>
<h4>6.5 Flipping coordinates</h4>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">df <-<span class="st"> </span><span class="kw">data.frame</span>(<span class="dt">var=</span><span class="kw">c</span>(<span class="st">"a"</span>, <span class="st">"b"</span>, <span class="st">"c"</span>), <span class="dt">nums=</span><span class="kw">c</span>(<span class="dv">1</span>:<span class="dv">3</span>))
<span class="kw">ggplot</span>(df, <span class="kw">aes</span>(<span class="dt">x=</span>var, <span class="dt">y=</span>nums)) +<span class="st"> </span><span class="kw">geom_bar</span>(<span class="dt">stat =</span> <span class="st">"identity"</span>) +<span class="st"> </span><span class="kw">coord_flip</span>() +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"Coordinates are flipped"</span>)</code></pre></div>
<p><img src='screenshots/ggplot_flipped.png' width='698' height='349' /></p>
<h4>6.6 Adjust X and Y axis limits</h4>
<p>There are 3 ways to change the X and Y axis limits.</p>
<ol style="list-style-type: decimal">
<li>Using coord_cartesian(xlim=c(x1,x2))</li>
<li>Using xlim(c(x1,x2))</li>
<li>Using scale_x_continuous(limits=c(x1,x2))</li>
</ol>
<p><strong>Warning:</strong> Items 2 and 3 will delete the datapoints that lie outisde the limit from the data itself. So, if you add any smoothing line line and such, the outcome will be distorted. Item 1 (coord_cartesian) does not delete any datapoint, but instead zooms in to a specific region of the chart.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_point</span>() +<span class="st"> </span><span class="kw">geom_smooth</span>() +<span class="st"> </span><span class="kw">coord_cartesian</span>(<span class="dt">ylim=</span><span class="kw">c</span>(<span class="dv">0</span>, <span class="dv">10000</span>)) +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"Coord_cartesian zoomed in!"</span>)</code></pre></div>
<p><img src='screenshots/ggplot_ylim1.png' width='698' height='349' /></p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_point</span>() +<span class="st"> </span><span class="kw">geom_smooth</span>() +<span class="st"> </span><span class="kw">ylim</span>(<span class="kw">c</span>(<span class="dv">0</span>, <span class="dv">10000</span>)) +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"Datapoints deleted: Note the change in smoothing lines!"</span>)
<span class="co">#> Warning messages:</span>
<span class="co">#> 1: Removed 5222 rows containing non-finite values</span>
<span class="co">#> (stat_smooth). </span>
<span class="co">#> 2: Removed 5222 rows containing missing values</span>
<span class="co">#> (geom_point).</span></code></pre></div>
<p><img src='screenshots/ggplot_ylim2.png' width='698' height='349' /></p>
<h4>6.7 Equal coordinates</h4>
<p>Adding <code>coord_equal()</code> to ggplot sets the limits of X and Y axis to be equal. Below is a meaningless example. So to save face for not giving a good example, I am not showing you the output.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>price, <span class="dt">y=</span>price+<span class="kw">runif</span>(<span class="kw">nrow</span>(diamonds), <span class="dv">100</span>, <span class="dv">10000</span>), <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_point</span>() +<span class="st"> </span><span class="kw">geom_smooth</span>() +<span class="st"> </span><span class="kw">coord_equal</span>()</code></pre></div>
<h4>6.8 Change themes</h4>
<p>Apart from the basic ggplot2 theme, you can change the look and feel of your plots using one of these builtin themes.</p>
<ol style="list-style-type: decimal">
<li>theme_gray()</li>
<li>theme_bw()</li>
<li>theme_linedraw()</li>
<li>theme_light()</li>
<li>theme_minimal()</li>
<li>theme_classic()</li>
<li>theme_void()</li>
</ol>
<p>The <code>ggthemes</code> package provides <a href="https://github.com/jrnold/ggthemes">additional ggplot themes</a> that imitates famous magazines and softwares. Here is an example of how to change the theme.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_point</span>() +<span class="st"> </span><span class="kw">geom_smooth</span>() +<span class="kw">theme_bw</span>() +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"bw Theme"</span>)</code></pre></div>
<p><img src='screenshots/ggplot_bwtheme.png' width='698' height='349' /></p>
<h4>6.9 Legend - Deleting and Changing Position</h4>
<p>By setting <code>theme(legend.position="none")</code>, you can remove the legend. By setting it to ‘top’, i.e. <code>theme(legend.position="top")</code>, you can move the legend around the plot. By setting <code>legend.postion</code> to a co-ordinate inside the plot you can place the legend inside the plot itself. The <code>legend.justification</code> denotes the anchor point of the legend, i.e. the point that will be placed on the co-ordinates given by <code>legend.position</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">p1 <-<span class="st"> </span><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_point</span>() +<span class="st"> </span><span class="kw">geom_smooth</span>() +<span class="st"> </span><span class="kw">theme</span>(<span class="dt">legend.position=</span><span class="st">"none"</span>) +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"legend.position='none'"</span>) <span class="co"># remove legend</span>
p2 <-<span class="st"> </span><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_point</span>() +<span class="st"> </span><span class="kw">geom_smooth</span>() +<span class="st"> </span><span class="kw">theme</span>(<span class="dt">legend.position=</span><span class="st">"top"</span>) +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"legend.position='top'"</span>) <span class="co"># legend at top</span>
p3 <-<span class="st"> </span><span class="kw">ggplot</span>(diamonds, <span class="kw">aes</span>(<span class="dt">x=</span>carat, <span class="dt">y=</span>price, <span class="dt">color=</span>cut)) +<span class="st"> </span><span class="kw">geom_point</span>() +<span class="st"> </span><span class="kw">geom_smooth</span>() +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"legend.position='coords inside plot'"</span>) +<span class="st"> </span><span class="kw">theme</span>(<span class="dt">legend.justification=</span><span class="kw">c</span>(<span class="dv">1</span>,<span class="dv">0</span>), <span class="dt">legend.position=</span><span class="kw">c</span>(<span class="dv">1</span>,<span class="dv">0</span>)) <span class="co"># legend inside the plot.</span>
<span class="kw">grid.arrange</span>(p1, p2, p3, <span class="dt">ncol=</span><span class="dv">3</span>) <span class="co"># arrange</span></code></pre></div>
<p><img src='screenshots/ggplot_legend.png' width='698' height='349' /></p>
<h4>6.10 Grid lines</h4>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">ggplot</span>(mtcars, <span class="kw">aes</span>(<span class="dt">x=</span>cyl)) +<span class="st"> </span><span class="kw">geom_bar</span>(<span class="dt">fill=</span><span class="st">'darkgoldenrod2'</span>) +
<span class="st"> </span><span class="kw">theme</span>(<span class="dt">panel.background =</span> <span class="kw">element_rect</span>(<span class="dt">fill =</span> <span class="st">'steelblue'</span>),
<span class="dt">panel.grid.major =</span> <span class="kw">element_line</span>(<span class="dt">colour =</span> <span class="st">"firebrick"</span>, <span class="dt">size=</span><span class="dv">3</span>),
<span class="dt">panel.grid.minor =</span> <span class="kw">element_line</span>(<span class="dt">colour =</span> <span class="st">"blue"</span>, <span class="dt">size=</span><span class="dv">1</span>))</code></pre></div>
<p><img src='screenshots/ggplot_grid.png' width='698' height='349' /></p>
<h4>6.11 Plot margin and background</h4>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">ggplot</span>(mtcars, <span class="kw">aes</span>(<span class="dt">x=</span>cyl)) +<span class="st"> </span><span class="kw">geom_bar</span>(<span class="dt">fill=</span><span class="st">"firebrick"</span>) +<span class="st"> </span><span class="kw">theme</span>(<span class="dt">plot.background=</span><span class="kw">element_rect</span>(<span class="dt">fill=</span><span class="st">"steelblue"</span>), <span class="dt">plot.margin =</span> <span class="kw">unit</span>(<span class="kw">c</span>(<span class="dv">2</span>, <span class="dv">4</span>, <span class="dv">1</span>, <span class="dv">3</span>), <span class="st">"cm"</span>)) <span class="co"># top, right, bottom, left</span></code></pre></div>
<p><img src='screenshots/ggplot_margin.png' width='698' height='349' /></p>
<h4>6.12 Annotation</h4>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(grid)
my_grob =<span class="st"> </span><span class="kw">grobTree</span>(<span class="kw">textGrob</span>(<span class="st">"This text is at x=0.1 and y=0.9, relative!</span><span class="ch">\n</span><span class="st"> Anchor point is at 0,0"</span>, <span class="dt">x=</span><span class="fl">0.1</span>, <span class="dt">y=</span><span class="fl">0.9</span>, <span class="dt">hjust=</span><span class="dv">0</span>,
<span class="dt">gp=</span><span class="kw">gpar</span>(<span class="dt">col=</span><span class="st">"firebrick"</span>, <span class="dt">fontsize=</span><span class="dv">25</span>, <span class="dt">fontface=</span><span class="st">"bold"</span>)))
<span class="kw">ggplot</span>(mtcars, <span class="kw">aes</span>(<span class="dt">x=</span>cyl)) +<span class="st"> </span><span class="kw">geom_bar</span>() +<span class="st"> </span><span class="kw">annotation_custom</span>(my_grob) +<span class="st"> </span><span class="kw">labs</span>(<span class="dt">title=</span><span class="st">"Annotation Example"</span>)</code></pre></div>
<p><img src='screenshots/ggplot_annotation.png' width='698' height='349' /></p>
<h4>6.13 Saving ggplot</h4>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">plot1 <-<span class="st"> </span><span class="kw">ggplot</span>(mtcars, <span class="kw">aes</span>(<span class="dt">x=</span>cyl)) +<span class="st"> </span><span class="kw">geom_bar</span>()
<span class="kw">ggsave</span>(<span class="st">"myggplot.png"</span>) <span class="co"># saves the last plot.</span>
<span class="kw">ggsave</span>(<span class="st">"myggplot.png"</span>, <span class="dt">plot=</span>plot1) <span class="co"># save a stored ggplot</span></code></pre></div>
<p>For a more comprehensive list, the <a href="Top50-Ggplot2-Visualizations-MasterList-R-Code.html">top 50 Ggplot2 visualizations</a> provides some advanced Ggplot2 charts and helps to choose the right type for your specific objectives.</p>
<p>For a quicker reference, refer to this <a href="ggplot2-cheatsheet.html">ggplot2 cheatsheet</a>.</p>
<hr />
<p>Have a suggestion or found a bug? Notify <a href="https://docs.google.com/forms/d/e/1FAIpQLSeIJmlvwe562R7JVpi5J2ydLyhk5-7OrGRMFGYrMJvjPal8eA/viewform">here.</a></p>
</div>
</div>
<div class="footer">
<hr>
<p>© 2016-17 Selva Prabhakaran. Powered by <a href="http://jekyllrb.com/">jekyll</a>,
<a href="http://yihui.name/knitr/">knitr</a>, and
<a href="http://johnmacfarlane.net/pandoc/">pandoc</a>.
This work is licensed under the <a href="http://creativecommons.org/licenses/by-nc/3.0/">Creative Commons License.</a>
</p>
</div>
</div> <!-- /container -->
<script src="//code.jquery.com/jquery.js"></script>
<script src="www/bootstrap.min.js"></script>
<script src="www/toc.js"></script>
<!-- MathJax Script -->
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<script type="text/javascript"
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
<!-- Google Analytics Code -->
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-69351797-1', 'auto');
ga('send', 'pageview');
</script>
<style type="text/css">
/* reduce spacing around math formula*/
.MathJax_Display {
margin: 0em 0em;
}
body {
font-family: 'Helvetica Neue', Roboto, Arial, sans-serif;
font-size: 16px;
line-height: 27px;
font-weight: 400;
}
blockquote p {
line-height: 1.75;
color: #717171;
}
.well li{
line-height: 28px;
}
li.dropdown-header {
display: block;
padding: 0px;
font-size: 14px;
}
</style>
</body>
</html>