diff --git a/.nojekyll b/.nojekyll index bdd080be2..891c331bc 100644 --- a/.nojekyll +++ b/.nojekyll @@ -1 +1 @@ -dfddee1e \ No newline at end of file +f5fd3962 \ No newline at end of file diff --git a/EDA.html b/EDA.html index 8b9e7f43e..bcd99de3f 100644 --- a/EDA.html +++ b/EDA.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -426,7 +426,7 @@ </li> <li><a href="#patterns-and-models" id="toc-patterns-and-models" class="nav-link" data-scroll-target="#patterns-and-models"><span class="header-section-number">10.6</span> Patterns and models</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">10.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/EDA.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/EDA.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" 
class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/arrow.html b/arrow.html index 0516141b4..405d5c54d 100644 --- a/arrow.html +++ b/arrow.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -420,7 +420,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">22.6</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/arrow.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/arrow.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header 
id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/base-R.html b/base-R.html index b01fffe91..5fe7e9df9 100644 --- a/base-R.html +++ b/base-R.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -423,7 +423,7 @@ <li><a href="#for-loops" id="toc-for-loops" class="nav-link" data-scroll-target="#for-loops"><span class="header-section-number">27.5</span> <code>for</code> loops</a></li> <li><a href="#plots" id="toc-plots" class="nav-link" data-scroll-target="#plots"><span class="header-section-number">27.6</span> Plots</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">27.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/base-R.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/base-R.qmd" class="toc-action">Editar essa página</a></p><p><a 
href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/communicate.html b/communicate.html index 28dbc7958..13fdac9c1 100644 --- a/communicate.html +++ b/communicate.html @@ -79,7 +79,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -641,7 +641,7 @@ <h1 class="title"><span id="sec-communicate-intro" class="quarto-section-identif <div class="nav-footer-left">R para Ciência de Dados (2ª edição) foi escrito por Hadley Wickham, Mine Çetinkaya-Rundel, e Garrett Grolemund.</div> <div class="nav-footer-center"> - <div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/communicate.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> + <div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/communicate.qmd" class="toc-action">Editar essa página</a></p><p><a 
href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> <div class="nav-footer-right">Este livro foi contruído com <a href="https://quarto.org/">Quarto</a>.</div> </div> </footer> diff --git a/communication.html b/communication.html index 96034790d..de795d6c8 100644 --- a/communication.html +++ b/communication.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -437,7 +437,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">11.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/communication.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/communication.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main 
class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/data-import.html b/data-import.html index 4345f5c29..c3b828545 100644 --- a/data-import.html +++ b/data-import.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -422,7 +422,7 @@ <li><a href="#sec-writing-to-a-file" id="toc-sec-writing-to-a-file" class="nav-link" data-scroll-target="#sec-writing-to-a-file"><span class="header-section-number">7.5</span> Writing to a file</a></li> <li><a href="#data-entry" id="toc-data-entry" class="nav-link" data-scroll-target="#data-entry"><span class="header-section-number">7.6</span> Data entry</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">7.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/data-import.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a 
href="https://github.com/cienciadedatos/pt-r4ds/edit/main/data-import.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> @@ -835,7 +835,7 @@ <h1 class="title"><span id="sec-data-import" class="quarto-section-identifier">< <span><span class="co">#> # A tibble: 1 × 5</span></span> <span><span class="co">#> row col expected actual file </span></span> <span><span class="co">#> <int> <int> <chr> <chr> <chr> </span></span> -<span><span class="co">#> 1 3 1 a double . /tmp/RtmprQ7oUD/file1e2549f0663d</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> +<span><span class="co">#> 1 3 1 a double . /tmp/RtmpmQKHAo/file1e8c59e6768d</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>This tells us that there was a problem in row 3, col 1 where readr expected a double but got a <code>.</code>. That suggests this dataset uses <code>.</code> for missing values. 
So then we set <code>na = "."</code>, the automatic guessing succeeds, giving us the numeric column that we want:</p> <div class="cell"> diff --git a/data-tidy.html b/data-tidy.html index d640aa754..87dbe5868 100644 --- a/data-tidy.html +++ b/data-tidy.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -423,7 +423,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">5.5</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/data-tidy.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/data-tidy.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div 
class="quarto-title"> diff --git a/data-transform.html b/data-transform.html index 5991ba60f..524d48fea 100644 --- a/data-transform.html +++ b/data-transform.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -438,7 +438,7 @@ </li> <li><a href="#sec-sample-size" id="toc-sec-sample-size" class="nav-link" data-scroll-target="#sec-sample-size"><span class="header-section-number">3.6</span> Case study: aggregates and sample size</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">3.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/data-transform.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/data-transform.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" 
id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/data-visualize.html b/data-visualize.html index 588c4e168..17094d09a 100644 --- a/data-visualize.html +++ b/data-visualize.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -438,7 +438,7 @@ </li> <li><a href="#problemas-comuns" id="toc-problemas-comuns" class="nav-link" data-scroll-target="#problemas-comuns"><span class="header-section-number">1.7</span> Problemas comuns</a></li> <li><a href="#resumo" id="toc-resumo" class="nav-link" data-scroll-target="#resumo"><span class="header-section-number">1.8</span> Resumo</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/data-visualize.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/data-visualize.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma 
issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/databases.html b/databases.html index b6d6cb3ad..8e11945dc 100644 --- a/databases.html +++ b/databases.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -428,7 +428,7 @@ </li> <li><a href="#sec-sql-expressions" id="toc-sec-sql-expressions" class="nav-link" data-scroll-target="#sec-sql-expressions"><span class="header-section-number">21.6</span> Function translations</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">21.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/databases.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/databases.qmd" class="toc-action">Editar essa página</a></p><p><a 
href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> @@ -573,7 +573,7 @@ <h1 class="title"><span id="sec-import-databases" class="quarto-section-identifi <div class="sourceCode" id="cb8"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">diamonds_db</span> <span class="op"><-</span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/tbl.html">tbl</a></span><span class="op">(</span><span class="va">con</span>, <span class="st">"diamonds"</span><span class="op">)</span></span> <span><span class="va">diamonds_db</span></span> <span><span class="co">#> # Source: table<diamonds> [?? x 10]</span></span> -<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]</span></span> +<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]</span></span> <span><span class="co">#> carat cut color clarity depth table price x y z</span></span> <span><span class="co">#> <dbl> <fct> <fct> <fct> <dbl> <dbl> <int> <dbl> <dbl> <dbl></span></span> <span><span class="co">#> 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43</span></span> @@ -610,7 +610,7 @@ <h1 class="title"><span id="sec-import-databases" class="quarto-section-identifi <span></span> <span><span class="va">big_diamonds_db</span></span> <span><span class="co">#> # Source: SQL [?? 
x 5]</span></span> -<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]</span></span> +<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]</span></span> <span><span class="co">#> carat cut color clarity price</span></span> <span><span class="co">#> <dbl> <fct> <fct> <fct> <int></span></span> <span><span class="co">#> 1 1.54 Premium E VS2 15002</span></span> @@ -838,7 +838,7 @@ <h1 class="title"><span id="sec-import-databases" class="quarto-section-identifi <span><span class="co">#> Use `na.rm = TRUE` to silence this warning</span></span> <span><span class="co">#> This warning is displayed once every 8 hours.</span></span> <span><span class="co">#> # Source: SQL [?? x 2]</span></span> -<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]</span></span> +<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]</span></span> <span><span class="co">#> dest delay</span></span> <span><span class="co">#> <chr> <dbl></span></span> <span><span class="co">#> 1 SFO 2.67</span></span> diff --git a/datetimes.html b/datetimes.html index b6463b45d..360b84290 100644 --- a/datetimes.html +++ b/datetimes.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i 
class="bi"></i> @@ -431,7 +431,7 @@ </li> <li><a href="#time-zones" id="toc-time-zones" class="nav-link" data-scroll-target="#time-zones"><span class="header-section-number">17.5</span> Time zones</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">17.6</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/datetimes.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/datetimes.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> @@ -448,14 +448,14 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> -</header><div class="cell"> -<div class="cell-output cell-output-stdout"> -<pre><code> -::: {.callout-warning} -Olá! Este capítulo do livro ainda **não está traduzido** para a versão Português-BR. 
- <br> Caso você queira contribuir com o projeto de tradução, leia as instruções em: <https://github.com/cienciadedatos/pt-r4ds/wiki> <br> -A versão original (em inglês) do livro R for Data Science está disponível em: <https://r4ds.hadley.nz/> -:::</code></pre> +</header><div class="callout callout-style-simple callout-warning"> +<div class="callout-body d-flex"> +<div class="callout-icon-container"> +<i class="callout-icon"></i> +</div> +<div class="callout-body-container"> +<p>Olá! Este capítulo do livro ainda <strong>não está traduzido</strong> para a versão Português-BR. <br> Caso você queira contribuir com o projeto de tradução, leia as instruções em: <a href="https://github.com/cienciadedatos/pt-r4ds/wiki" class="uri">https://github.com/cienciadedatos/pt-r4ds/wiki</a> <br> A versão original (em inglês) do livro R for Data Science está disponível em: <a href="https://r4ds.hadley.nz/" class="uri">https://r4ds.hadley.nz/</a></p> +</div> </div> </div> <section id="introduction" class="level2" data-number="17.1"><h2 data-number="17.1" class="anchored" data-anchor-id="introduction"> @@ -468,7 +468,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.1.1</span> Prerequisites</h3> <p>This chapter will focus on the <strong>lubridate</strong> package, which makes it easier to work with dates and times in R. As of the latest tidyverse release, lubridate is part of core tidyverse. 
We will also need nycflights13 for practice data.</p> <div class="cell"> -<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va"><a href="https://tidyverse.tidyverse.org">tidyverse</a></span><span class="op">)</span></span> +<div class="sourceCode" id="cb1"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va"><a href="https://tidyverse.tidyverse.org">tidyverse</a></span><span class="op">)</span></span> <span><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va"><a href="https://github.com/hadley/nycflights13">nycflights13</a></span><span class="op">)</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> </section></section><section id="sec-creating-datetimes" class="level2" data-number="17.2"><h2 data-number="17.2" class="anchored" data-anchor-id="sec-creating-datetimes"> @@ -483,10 +483,10 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <p>You should always use the simplest possible data type that works for your needs. That means if you can use a date instead of a date-time, you should. 
Date-times are substantially more complicated because of the need to handle time zones, which we’ll come back to at the end of the chapter.</p> <p>To get the current date or date-time you can use <code><a href="https://lubridate.tidyverse.org/reference/now.html">today()</a></code> or <code><a href="https://lubridate.tidyverse.org/reference/now.html">now()</a></code>:</p> <div class="cell"> -<div class="sourceCode" id="cb3"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/now.html">today</a></span><span class="op">(</span><span class="op">)</span></span> +<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/now.html">today</a></span><span class="op">(</span><span class="op">)</span></span> <span><span class="co">#> [1] "2023-11-17"</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/now.html">now</a></span><span class="op">(</span><span class="op">)</span></span> -<span><span class="co">#> [1] "2023-11-17 21:36:55 UTC"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> +<span><span class="co">#> [1] "2023-11-17 22:04:44 UTC"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>Otherwise, the following sections describe the four ways you’re likely to create a date/time:</p> <ul> @@ -499,7 +499,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.2.1</span> During import</h3> <p>If your CSV contains an ISO8601 date or date-time, you don’t need to do anything; readr will automatically recognize it:</p> <div class="cell"> -<div class="sourceCode" id="cb4"><pre 
class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">csv</span> <span class="op"><-</span> <span class="st">"</span></span> +<div class="sourceCode" id="cb3"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">csv</span> <span class="op"><-</span> <span class="st">"</span></span> <span><span class="st"> date,datetime</span></span> <span><span class="st"> 2022-01-02,2022-01-02 05:12</span></span> <span><span class="st">"</span></span> @@ -628,7 +628,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>And this code shows a few options applied to a very ambiguous date:</p> <div class="cell" data-messages="false"> -<div class="sourceCode" id="cb5"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">csv</span> <span class="op"><-</span> <span class="st">"</span></span> +<div class="sourceCode" id="cb4"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">csv</span> <span class="op"><-</span> <span class="st">"</span></span> <span><span class="st"> date</span></span> <span><span class="st"> 01/02/15</span></span> <span><span class="st">"</span></span> @@ -657,7 +657,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.2.2</span> From strings</h3> <p>The date-time specification language is powerful, but requires careful analysis of the date format. An alternative approach is to use lubridate’s helpers which attempt to automatically determine the format once you specify the order of the component. To use them, identify the order in which year, month, and day appear in your dates, then arrange “y”, “m”, and “d” in the same order. That gives you the name of the lubridate function that will parse your date. 
For example:</p> <div class="cell"> -<div class="sourceCode" id="cb6"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2017-01-31"</span><span class="op">)</span></span> +<div class="sourceCode" id="cb5"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2017-01-31"</span><span class="op">)</span></span> <span><span class="co">#> [1] "2017-01-31"</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">mdy</a></span><span class="op">(</span><span class="st">"January 31st, 2017"</span><span class="op">)</span></span> <span><span class="co">#> [1] "2017-01-31"</span></span> @@ -666,14 +666,14 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p><code><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd()</a></code> and friends create dates. 
To create a date-time, add an underscore and one or more of “h”, “m”, and “s” to the name of the parsing function:</p> <div class="cell"> -<div class="sourceCode" id="cb7"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2017-01-31 20:11:59"</span><span class="op">)</span></span> +<div class="sourceCode" id="cb6"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2017-01-31 20:11:59"</span><span class="op">)</span></span> <span><span class="co">#> [1] "2017-01-31 20:11:59 UTC"</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">mdy_hm</a></span><span class="op">(</span><span class="st">"01/31/2017 08:01"</span><span class="op">)</span></span> <span><span class="co">#> [1] "2017-01-31 08:01:00 UTC"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>You can also force the creation of a date-time from a date by supplying a timezone:</p> <div class="cell"> -<div class="sourceCode" id="cb8"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2017-01-31"</span>, tz <span class="op">=</span> <span class="st">"UTC"</span><span class="op">)</span></span> +<div class="sourceCode" id="cb7"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2017-01-31"</span>, tz 
<span class="op">=</span> <span class="st">"UTC"</span><span class="op">)</span></span> <span><span class="co">#> [1] "2017-01-31 UTC"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>Here I use the UTC<a href="#fn3" class="footnote-ref" id="fnref3" role="doc-noteref"><sup>3</sup></a> timezone which you might also know as GMT, or Greenwich Mean Time, the time at 0° longitude<a href="#fn4" class="footnote-ref" id="fnref4" role="doc-noteref"><sup>4</sup></a> . It doesn’t use daylight saving time, making it a bit easier to compute with .</p> @@ -681,7 +681,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.2.3</span> From individual components</h3> <p>Instead of a single string, sometimes you’ll have the individual components of the date-time spread across multiple columns. This is what we have in the <code>flights</code> data:</p> <div class="cell"> -<div class="sourceCode" id="cb9"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb8"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span><span class="op">(</span><span class="va">year</span>, <span class="va">month</span>, <span class="va">day</span>, <span class="va">hour</span>, <span class="va">minute</span><span class="op">)</span></span> <span><span class="co">#> # A tibble: 336,776 × 5</span></span> <span><span class="co">#> year month day hour minute</span></span> @@ -696,7 +696,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>To create a date/time from this sort of input, use 
<code><a href="https://lubridate.tidyverse.org/reference/make_datetime.html">make_date()</a></code> for dates, or <code><a href="https://lubridate.tidyverse.org/reference/make_datetime.html">make_datetime()</a></code> for date-times:</p> <div class="cell"> -<div class="sourceCode" id="cb10"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb9"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span><span class="op">(</span><span class="va">year</span>, <span class="va">month</span>, <span class="va">day</span>, <span class="va">hour</span>, <span class="va">minute</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span><span class="op">(</span>departure <span class="op">=</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/make_datetime.html">make_datetime</a></span><span class="op">(</span><span class="va">year</span>, <span class="va">month</span>, <span class="va">day</span>, <span class="va">hour</span>, <span class="va">minute</span><span class="op">)</span><span class="op">)</span></span> <span><span class="co">#> # A tibble: 336,776 × 6</span></span> @@ -712,7 +712,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>Let’s do the same thing for each of the four time columns in <code>flights</code>. The times are represented in a slightly odd format, so we use modulus arithmetic to pull out the hour and minute components. 
Once we’ve created the date-time variables, we focus in on the variables we’ll explore in the rest of the chapter.</p> <div class="cell"> -<div class="sourceCode" id="cb11"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">make_datetime_100</span> <span class="op"><-</span> <span class="kw">function</span><span class="op">(</span><span class="va">year</span>, <span class="va">month</span>, <span class="va">day</span>, <span class="va">time</span><span class="op">)</span> <span class="op">{</span></span> +<div class="sourceCode" id="cb10"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">make_datetime_100</span> <span class="op"><-</span> <span class="kw">function</span><span class="op">(</span><span class="va">year</span>, <span class="va">month</span>, <span class="va">day</span>, <span class="va">time</span><span class="op">)</span> <span class="op">{</span></span> <span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/make_datetime.html">make_datetime</a></span><span class="op">(</span><span class="va">year</span>, <span class="va">month</span>, <span class="va">day</span>, <span class="va">time</span> <span class="op"><a href="https://rdrr.io/r/base/Arithmetic.html">%/%</a></span> <span class="fl">100</span>, <span class="va">time</span> <span class="op"><a href="https://rdrr.io/r/base/Arithmetic.html">%%</a></span> <span class="fl">100</span><span class="op">)</span></span> <span><span class="op">}</span></span> <span></span> @@ -741,7 +741,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>With this data, we can visualize the distribution of departure times across the year:</p> <div class="cell"> -<div class="sourceCode" id="cb12"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> +<div 
class="sourceCode" id="cb11"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span><span class="op">(</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span><span class="op">(</span>x <span class="op">=</span> <span class="va">dep_time</span><span class="op">)</span><span class="op">)</span> <span class="op">+</span> </span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/geom_histogram.html">geom_freqpoly</a></span><span class="op">(</span>binwidth <span class="op">=</span> <span class="fl">86400</span><span class="op">)</span> <span class="co"># 86400 seconds = 1 day</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> <div class="cell-output-display"> @@ -750,7 +750,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>Or within a single day:</p> <div class="cell"> -<div class="sourceCode" id="cb13"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb12"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span><span class="op">(</span><span class="va">dep_time</span> <span class="op"><</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="fl">20130102</span><span class="op">)</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a 
href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span><span class="op">(</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span><span class="op">(</span>x <span class="op">=</span> <span class="va">dep_time</span><span class="op">)</span><span class="op">)</span> <span class="op">+</span> </span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/geom_histogram.html">geom_freqpoly</a></span><span class="op">(</span>binwidth <span class="op">=</span> <span class="fl">600</span><span class="op">)</span> <span class="co"># 600 s = 10 minutes</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> @@ -763,14 +763,14 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.2.4</span> From other types</h3> <p>You may want to switch between a date-time and a date. That’s the job of <code><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_datetime()</a></code> and <code><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_date()</a></code>:</p> <div class="cell"> -<div class="sourceCode" id="cb14"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_datetime</a></span><span class="op">(</span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/now.html">today</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span> +<div class="sourceCode" id="cb13"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_datetime</a></span><span class="op">(</span><span class="fu"><a 
href="https://lubridate.tidyverse.org/reference/now.html">today</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span> <span><span class="co">#> [1] "2023-11-17 UTC"</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_date</a></span><span class="op">(</span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/now.html">now</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span> <span><span class="co">#> [1] "2023-11-17"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>Sometimes you’ll get date/times as numeric offsets from the “Unix Epoch”, 1970-01-01. If the offset is in seconds, use <code><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_datetime()</a></code>; if it’s in days, use <code><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_date()</a></code>.</p> <div class="cell"> -<div class="sourceCode" id="cb15"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_datetime</a></span><span class="op">(</span><span class="fl">60</span> <span class="op">*</span> <span class="fl">60</span> <span class="op">*</span> <span class="fl">10</span><span class="op">)</span></span> +<div class="sourceCode" id="cb14"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_datetime</a></span><span class="op">(</span><span class="fl">60</span> <span class="op">*</span> <span class="fl">60</span> <span class="op">*</span> <span class="fl">10</span><span class="op">)</span></span> <span><span class="co">#> [1] "1970-01-01 10:00:00 UTC"</span></span> <span><span class="fu"><a 
href="https://lubridate.tidyverse.org/reference/as_date.html">as_date</a></span><span class="op">(</span><span class="fl">365</span> <span class="op">*</span> <span class="fl">10</span> <span class="op">+</span> <span class="fl">2</span><span class="op">)</span></span> <span><span class="co">#> [1] "1980-01-01"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> @@ -781,14 +781,14 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <li> <p>What happens if you parse a string that contains invalid dates?</p> <div class="cell"> -<div class="sourceCode" id="cb16"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span><span class="op">(</span><span class="st">"2010-10-10"</span>, <span class="st">"bananas"</span><span class="op">)</span><span class="op">)</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> +<div class="sourceCode" id="cb15"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span><span class="op">(</span><span class="st">"2010-10-10"</span>, <span class="st">"bananas"</span><span class="op">)</span><span class="op">)</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> </li> <li><p>What does the <code>tzone</code> argument to <code><a href="https://lubridate.tidyverse.org/reference/now.html">today()</a></code> do? 
Why is it important?</p></li> <li> <p>For each of the following date-times, show how you’d parse it using a readr column specification and a lubridate function.</p> <div class="cell"> -<div class="sourceCode" id="cb17"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">d1</span> <span class="op"><-</span> <span class="st">"January 1, 2010"</span></span> +<div class="sourceCode" id="cb16"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">d1</span> <span class="op"><-</span> <span class="st">"January 1, 2010"</span></span> <span><span class="va">d2</span> <span class="op"><-</span> <span class="st">"2015-Mar-07"</span></span> <span><span class="va">d3</span> <span class="op"><-</span> <span class="st">"06-Jun-2017"</span></span> <span><span class="va">d4</span> <span class="op"><-</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span><span class="op">(</span><span class="st">"August 19 (2015)"</span>, <span class="st">"July 1 (2015)"</span><span class="op">)</span></span> @@ -804,7 +804,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.3.1</span> Getting components</h3> <p>You can pull out individual parts of the date with the accessor functions <code><a href="https://lubridate.tidyverse.org/reference/year.html">year()</a></code>, <code><a href="https://lubridate.tidyverse.org/reference/month.html">month()</a></code>, <code><a href="https://lubridate.tidyverse.org/reference/day.html">mday()</a></code> (day of the month), <code><a href="https://lubridate.tidyverse.org/reference/day.html">yday()</a></code> (day of the year), <code><a href="https://lubridate.tidyverse.org/reference/day.html">wday()</a></code> (day of the week), <code><a href="https://lubridate.tidyverse.org/reference/hour.html">hour()</a></code>, <code><a 
href="https://lubridate.tidyverse.org/reference/minute.html">minute()</a></code>, and <code><a href="https://lubridate.tidyverse.org/reference/second.html">second()</a></code>. These are effectively the opposites of <code><a href="https://lubridate.tidyverse.org/reference/make_datetime.html">make_datetime()</a></code>.</p> <div class="cell"> -<div class="sourceCode" id="cb18"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">datetime</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2026-07-08 12:34:56"</span><span class="op">)</span></span> +<div class="sourceCode" id="cb17"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">datetime</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2026-07-08 12:34:56"</span><span class="op">)</span></span> <span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/year.html">year</a></span><span class="op">(</span><span class="va">datetime</span><span class="op">)</span></span> <span><span class="co">#> [1] 2026</span></span> @@ -820,7 +820,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>For <code><a href="https://lubridate.tidyverse.org/reference/month.html">month()</a></code> and <code><a href="https://lubridate.tidyverse.org/reference/day.html">wday()</a></code> you can set <code>label = TRUE</code> to return the abbreviated name of the month or day of the week. 
Set <code>abbr = FALSE</code> to return the full name.</p> <div class="cell"> -<div class="sourceCode" id="cb19"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/month.html">month</a></span><span class="op">(</span><span class="va">datetime</span>, label <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span></span> +<div class="sourceCode" id="cb18"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/month.html">month</a></span><span class="op">(</span><span class="va">datetime</span>, label <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span></span> <span><span class="co">#> [1] Jul</span></span> <span><span class="co">#> 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/day.html">wday</a></span><span class="op">(</span><span class="va">datetime</span>, label <span class="op">=</span> <span class="cn">TRUE</span>, abbr <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span></span> @@ -829,7 +829,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>We can use <code><a href="https://lubridate.tidyverse.org/reference/day.html">wday()</a></code> to see that more flights depart during the week than on the weekend:</p> <div class="cell"> -<div class="sourceCode" id="cb20"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb19"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a 
href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span><span class="op">(</span>wday <span class="op">=</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/day.html">wday</a></span><span class="op">(</span><span class="va">dep_time</span>, label <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span><span class="op">(</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span><span class="op">(</span>x <span class="op">=</span> <span class="va">wday</span><span class="op">)</span><span class="op">)</span> <span class="op">+</span></span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/geom_bar.html">geom_bar</a></span><span class="op">(</span><span class="op">)</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> @@ -839,7 +839,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>We can also look at the average departure delay by minute within the hour. 
There’s an interesting pattern: flights leaving in minutes 20-30 and 50-60 have much lower delays than the rest of the hour!</p> <div class="cell"> -<div class="sourceCode" id="cb21"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb20"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span><span class="op">(</span>minute <span class="op">=</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/minute.html">minute</a></span><span class="op">(</span><span class="va">dep_time</span><span class="op">)</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span><span class="op">(</span><span class="va">minute</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/summarise.html">summarize</a></span><span class="op">(</span></span> @@ -854,7 +854,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>Interestingly, if we look at the <em>scheduled</em> departure time we don’t see such a strong pattern:</p> <div class="cell"> -<div class="sourceCode" id="cb22"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">sched_dep</span> <span class="op"><-</span> <span class="va">flights_dt</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb21"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">sched_dep</span> <span class="op"><-</span> <span class="va">flights_dt</span> <span 
class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span><span class="op">(</span>minute <span class="op">=</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/minute.html">minute</a></span><span class="op">(</span><span class="va">sched_dep_time</span><span class="op">)</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span><span class="op">(</span><span class="va">minute</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/summarise.html">summarize</a></span><span class="op">(</span></span> @@ -881,7 +881,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.3.2</span> Rounding</h3> <p>An alternative approach to plotting individual components is to round the date to a nearby unit of time, with <code><a href="https://lubridate.tidyverse.org/reference/round_date.html">floor_date()</a></code>, <code><a href="https://lubridate.tidyverse.org/reference/round_date.html">round_date()</a></code>, and <code><a href="https://lubridate.tidyverse.org/reference/round_date.html">ceiling_date()</a></code>. Each function takes a vector of dates to adjust and then the name of the unit to round down (floor), round up (ceiling), or round to. 
This, for example, allows us to plot the number of flights per week:</p> <div class="cell"> -<div class="sourceCode" id="cb23"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb22"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/count.html">count</a></span><span class="op">(</span>week <span class="op">=</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/round_date.html">floor_date</a></span><span class="op">(</span><span class="va">dep_time</span>, <span class="st">"week"</span><span class="op">)</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span><span class="op">(</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span><span class="op">(</span>x <span class="op">=</span> <span class="va">week</span>, y <span class="op">=</span> <span class="va">n</span><span class="op">)</span><span class="op">)</span> <span class="op">+</span></span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/geom_path.html">geom_line</a></span><span class="op">(</span><span class="op">)</span> <span class="op">+</span> </span> @@ -892,7 +892,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>You can use rounding to show the distribution of flights across the course of a day by computing the difference between <code>dep_time</code> and the earliest instant of that day:</p> <div class="cell"> -<div class="sourceCode" id="cb24"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> 
<span class="op">|></span> </span> +<div class="sourceCode" id="cb23"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span><span class="op">(</span>dep_hour <span class="op">=</span> <span class="va">dep_time</span> <span class="op">-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/round_date.html">floor_date</a></span><span class="op">(</span><span class="va">dep_time</span>, <span class="st">"day"</span><span class="op">)</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span><span class="op">(</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span><span class="op">(</span>x <span class="op">=</span> <span class="va">dep_hour</span><span class="op">)</span><span class="op">)</span> <span class="op">+</span></span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/geom_histogram.html">geom_freqpoly</a></span><span class="op">(</span>binwidth <span class="op">=</span> <span class="fl">60</span> <span class="op">*</span> <span class="fl">30</span><span class="op">)</span></span> @@ -904,7 +904,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>Computing the difference between a pair of date-times yields a difftime (more on that in <a href="#sec-intervals"><span>Seção 17.4.3</span></a>). 
We can convert that to an <code>hms</code> object to get a more useful x-axis:</p> <div class="cell"> -<div class="sourceCode" id="cb25"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb24"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span><span class="op">(</span>dep_hour <span class="op">=</span> <span class="fu">hms</span><span class="fu">::</span><span class="fu"><a href="https://hms.tidyverse.org/reference/hms.html">as_hms</a></span><span class="op">(</span><span class="va">dep_time</span> <span class="op">-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/round_date.html">floor_date</a></span><span class="op">(</span><span class="va">dep_time</span>, <span class="st">"day"</span><span class="op">)</span><span class="op">)</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span><span class="op">(</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span><span class="op">(</span>x <span class="op">=</span> <span class="va">dep_hour</span><span class="op">)</span><span class="op">)</span> <span class="op">+</span></span> <span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/geom_histogram.html">geom_freqpoly</a></span><span class="op">(</span>binwidth <span class="op">=</span> <span class="fl">60</span> <span class="op">*</span> <span class="fl">30</span><span class="op">)</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> @@ -916,7 +916,7 @@ <h1 
class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.3.3</span> Modifying components</h3> <p>You can also use each accessor function to modify the components of a date/time. This doesn’t come up much in data analysis, but can be useful when cleaning data that has clearly incorrect dates.</p> <div class="cell"> -<div class="sourceCode" id="cb26"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="op">(</span><span class="va">datetime</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2026-07-08 12:34:56"</span><span class="op">)</span><span class="op">)</span></span> +<div class="sourceCode" id="cb25"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="op">(</span><span class="va">datetime</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2026-07-08 12:34:56"</span><span class="op">)</span><span class="op">)</span></span> <span><span class="co">#> [1] "2026-07-08 12:34:56 UTC"</span></span> <span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/year.html">year</a></span><span class="op">(</span><span class="va">datetime</span><span class="op">)</span> <span class="op"><-</span> <span class="fl">2030</span></span> @@ -931,12 +931,12 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>Alternatively, rather than modifying an existing variable, you can create a new date-time with <code><a href="https://rdrr.io/r/stats/update.html">update()</a></code>. 
This also allows you to set multiple values in one step:</p> <div class="cell"> -<div class="sourceCode" id="cb27"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/stats/update.html">update</a></span><span class="op">(</span><span class="va">datetime</span>, year <span class="op">=</span> <span class="fl">2030</span>, month <span class="op">=</span> <span class="fl">2</span>, mday <span class="op">=</span> <span class="fl">2</span>, hour <span class="op">=</span> <span class="fl">2</span><span class="op">)</span></span> +<div class="sourceCode" id="cb26"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/stats/update.html">update</a></span><span class="op">(</span><span class="va">datetime</span>, year <span class="op">=</span> <span class="fl">2030</span>, month <span class="op">=</span> <span class="fl">2</span>, mday <span class="op">=</span> <span class="fl">2</span>, hour <span class="op">=</span> <span class="fl">2</span><span class="op">)</span></span> <span><span class="co">#> [1] "2030-02-02 02:34:56 UTC"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>If values are too big, they will roll-over:</p> <div class="cell"> -<div class="sourceCode" id="cb28"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/stats/update.html">update</a></span><span class="op">(</span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2023-02-01"</span><span class="op">)</span>, mday <span class="op">=</span> <span class="fl">30</span><span class="op">)</span></span> +<div class="sourceCode" id="cb27"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode 
R"><span><span class="fu"><a href="https://rdrr.io/r/stats/update.html">update</a></span><span class="op">(</span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2023-02-01"</span><span class="op">)</span>, mday <span class="op">=</span> <span class="fl">30</span><span class="op">)</span></span> <span><span class="co">#> [1] "2023-03-02"</span></span> <span><span class="fu"><a href="https://rdrr.io/r/stats/update.html">update</a></span><span class="op">(</span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2023-02-01"</span><span class="op">)</span>, hour <span class="op">=</span> <span class="fl">400</span><span class="op">)</span></span> <span><span class="co">#> [1] "2023-02-17 16:00:00 UTC"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> @@ -967,19 +967,19 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.4.1</span> Durations</h3> <p>In R, when you subtract two dates, you get a difftime object:</p> <div class="cell"> -<div class="sourceCode" id="cb29"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="co"># How old is Hadley?</span></span> +<div class="sourceCode" id="cb28"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="co"># How old is Hadley?</span></span> <span><span class="va">h_age</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/now.html">today</a></span><span class="op">(</span><span class="op">)</span> <span class="op">-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span 
class="st">"1979-10-14"</span><span class="op">)</span></span> <span><span class="va">h_age</span></span> <span><span class="co">#> Time difference of 16105 days</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>A <code>difftime</code> class object records a time span of seconds, minutes, hours, days, or weeks. This ambiguity can make difftimes a little painful to work with, so lubridate provides an alternative which always uses seconds: the <strong>duration</strong>.</p> <div class="cell"> -<div class="sourceCode" id="cb30"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/as.duration.html">as.duration</a></span><span class="op">(</span><span class="va">h_age</span><span class="op">)</span></span> +<div class="sourceCode" id="cb29"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/as.duration.html">as.duration</a></span><span class="op">(</span><span class="va">h_age</span><span class="op">)</span></span> <span><span class="co">#> [1] "1391472000s (~44.09 years)"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>Durations come with a bunch of convenient constructors:</p> <div class="cell"> -<div class="sourceCode" id="cb31"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">dseconds</a></span><span class="op">(</span><span class="fl">15</span><span class="op">)</span></span> +<div class="sourceCode" id="cb30"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a 
href="https://lubridate.tidyverse.org/reference/duration.html">dseconds</a></span><span class="op">(</span><span class="fl">15</span><span class="op">)</span></span> <span><span class="co">#> [1] "15s"</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">dminutes</a></span><span class="op">(</span><span class="fl">10</span><span class="op">)</span></span> <span><span class="co">#> [1] "600s (~10 minutes)"</span></span> @@ -996,19 +996,19 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <p>Durations always record the time span in seconds. Larger units are created by converting minutes, hours, days, weeks, and years to seconds: 60 seconds in a minute, 60 minutes in an hour, 24 hours in a day, and 7 days in a week. Larger time units are more problematic. A year uses the “average” number of days in a year, i.e. 365.25. There’s no way to convert a month to a duration, because there’s just too much variation.</p> <p>You can add and multiply durations:</p> <div class="cell"> -<div class="sourceCode" id="cb32"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fl">2</span> <span class="op">*</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">dyears</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> +<div class="sourceCode" id="cb31"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fl">2</span> <span class="op">*</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">dyears</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> <span><span class="co">#> [1] "63115200s (~2 years)"</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">dyears</a></span><span class="op">(</span><span 
class="fl">1</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">dweeks</a></span><span class="op">(</span><span class="fl">12</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">dhours</a></span><span class="op">(</span><span class="fl">15</span><span class="op">)</span></span> <span><span class="co">#> [1] "38869200s (~1.23 years)"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>You can add and subtract durations to and from days:</p> <div class="cell"> -<div class="sourceCode" id="cb33"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">tomorrow</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/now.html">today</a></span><span class="op">(</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">ddays</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> +<div class="sourceCode" id="cb32"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">tomorrow</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/now.html">today</a></span><span class="op">(</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">ddays</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> <span><span class="va">last_year</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/now.html">today</a></span><span class="op">(</span><span 
class="op">)</span> <span class="op">-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">dyears</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>However, because durations represent an exact number of seconds, sometimes you might get an unexpected result:</p> <div class="cell"> -<div class="sourceCode" id="cb34"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">one_am</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2026-03-08 01:00:00"</span>, tz <span class="op">=</span> <span class="st">"America/New_York"</span><span class="op">)</span></span> +<div class="sourceCode" id="cb33"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">one_am</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2026-03-08 01:00:00"</span>, tz <span class="op">=</span> <span class="st">"America/New_York"</span><span class="op">)</span></span> <span></span> <span><span class="va">one_am</span></span> <span><span class="co">#> [1] "2026-03-08 01:00:00 EST"</span></span> @@ -1020,14 +1020,14 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <span class="header-section-number">17.4.2</span> Periods</h3> <p>To solve this problem, lubridate provides <strong>periods</strong>. Periods are time spans but don’t have a fixed length in seconds, instead they work with “human” times, like days and months. 
That allows them to work in a more intuitive way:</p> <div class="cell"> -<div class="sourceCode" id="cb35"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">one_am</span></span> +<div class="sourceCode" id="cb34"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">one_am</span></span> <span><span class="co">#> [1] "2026-03-08 01:00:00 EST"</span></span> <span><span class="va">one_am</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> <span><span class="co">#> [1] "2026-03-09 01:00:00 EDT"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>Like durations, periods can be created with a number of friendly constructor functions.</p> <div class="cell"> -<div class="sourceCode" id="cb36"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">hours</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span><span class="op">(</span><span class="fl">12</span>, <span class="fl">24</span><span class="op">)</span><span class="op">)</span></span> +<div class="sourceCode" id="cb35"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">hours</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span><span class="op">(</span><span class="fl">12</span>, <span class="fl">24</span><span class="op">)</span><span class="op">)</span></span> <span><span class="co">#> [1] "12H 0M 0S" "24H 0M 0S"</span></span> <span><span class="fu"><a 
href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">7</span><span class="op">)</span></span> <span><span class="co">#> [1] "7d 0H 0M 0S"</span></span> @@ -1037,14 +1037,14 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>You can add and multiply periods:</p> <div class="cell"> -<div class="sourceCode" id="cb37"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fl">10</span> <span class="op">*</span> <span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/weekday.POSIXt.html">months</a></span><span class="op">(</span><span class="fl">6</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span><span class="op">)</span></span> +<div class="sourceCode" id="cb36"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fl">10</span> <span class="op">*</span> <span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/weekday.POSIXt.html">months</a></span><span class="op">(</span><span class="fl">6</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span><span class="op">)</span></span> <span><span class="co">#> [1] "60m 10d 0H 0M 0S"</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">50</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">hours</a></span><span class="op">(</span><span class="fl">25</span><span 
class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">minutes</a></span><span class="op">(</span><span class="fl">2</span><span class="op">)</span></span> <span><span class="co">#> [1] "50d 25H 2M 0S"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>And of course, add them to dates. Compared to durations, periods are more likely to do what you expect:</p> <div class="cell"> -<div class="sourceCode" id="cb38"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="co"># A leap year</span></span> +<div class="sourceCode" id="cb37"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="co"># A leap year</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2024-01-01"</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/duration.html">dyears</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> <span><span class="co">#> [1] "2024-12-31 06:00:00 UTC"</span></span> <span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2024-01-01"</span><span class="op">)</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">years</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> @@ -1058,7 +1058,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>Let’s use periods to fix an oddity related to our flight dates. 
Some planes appear to have arrived at their destination <em>before</em> they departed from New York City.</p> <div class="cell"> -<div class="sourceCode" id="cb39"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb38"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span><span class="op">(</span><span class="va">arr_time</span> <span class="op"><</span> <span class="va">dep_time</span><span class="op">)</span> </span> <span><span class="co">#> # A tibble: 10,633 × 9</span></span> <span><span class="co">#> origin dest dep_delay arr_delay dep_time sched_dep_time </span></span> @@ -1074,7 +1074,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>These are overnight flights. We used the same date information for both the departure and the arrival times, but these flights arrived on the following day. 
We can fix this by adding <code>days(1)</code> to the arrival time of each overnight flight.</p> <div class="cell"> -<div class="sourceCode" id="cb40"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op"><-</span> <span class="va">flights_dt</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb39"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op"><-</span> <span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span><span class="op">(</span></span> <span> overnight <span class="op">=</span> <span class="va">arr_time</span> <span class="op"><</span> <span class="va">dep_time</span>,</span> <span> arr_time <span class="op">=</span> <span class="va">arr_time</span> <span class="op">+</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="va">overnight</span><span class="op">)</span>,</span> @@ -1083,7 +1083,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>Now all of our flights obey the laws of physics.</p> <div class="cell"> -<div class="sourceCode" id="cb41"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> +<div class="sourceCode" id="cb40"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">flights_dt</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span><span class="op">(</span><span class="va">arr_time</span> <span class="op"><</span> <span class="va">dep_time</span><span class="op">)</span> </span> 
<span><span class="co">#> # A tibble: 0 × 10</span></span> <span><span class="co">#> # ℹ 10 variables: origin <chr>, dest <chr>, dep_delay <dbl>,</span></span> @@ -1094,13 +1094,13 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <p>What does <code>dyears(1) / ddays(365)</code> return? It’s not quite one, because <code><a href="https://lubridate.tidyverse.org/reference/duration.html">dyears()</a></code> is defined as the number of seconds per average year, which is 365.25 days.</p> <p>What does <code>years(1) / days(1)</code> return? Well, if the year was 2015 it should return 365, but if it was 2016, it should return 366! There’s not quite enough information for lubridate to give a single clear answer. What it does instead is give an estimate:</p> <div class="cell"> -<div class="sourceCode" id="cb42"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">years</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span> <span class="op">/</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> +<div class="sourceCode" id="cb41"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">years</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span> <span class="op">/</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> <span><span class="co">#> [1] 365.25</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>If you want 
a more accurate measurement, you’ll have to use an <strong>interval</strong>. An interval is a pair of starting and ending date times, or you can think of it as a duration with a starting point.</p> <p>You can create an interval by writing <code>start %--% end</code>:</p> <div class="cell"> -<div class="sourceCode" id="cb43"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">y2023</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2023-01-01"</span><span class="op">)</span> <span class="op"><a href="https://lubridate.tidyverse.org/reference/interval.html">%--%</a></span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2024-01-01"</span><span class="op">)</span></span> +<div class="sourceCode" id="cb42"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">y2023</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2023-01-01"</span><span class="op">)</span> <span class="op"><a href="https://lubridate.tidyverse.org/reference/interval.html">%--%</a></span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2024-01-01"</span><span class="op">)</span></span> <span><span class="va">y2024</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2024-01-01"</span><span class="op">)</span> <span class="op"><a href="https://lubridate.tidyverse.org/reference/interval.html">%--%</a></span> <span class="fu"><a 
href="https://lubridate.tidyverse.org/reference/ymd.html">ymd</a></span><span class="op">(</span><span class="st">"2025-01-01"</span><span class="op">)</span></span> <span></span> <span><span class="va">y2023</span></span> @@ -1110,7 +1110,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>You could then divide it by <code><a href="https://lubridate.tidyverse.org/reference/period.html">days()</a></code> to find out how many days fit in the year:</p> <div class="cell"> -<div class="sourceCode" id="cb44"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">y2023</span> <span class="op">/</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> +<div class="sourceCode" id="cb43"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">y2023</span> <span class="op">/</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> <span><span class="co">#> [1] 365</span></span> <span><span class="va">y2024</span> <span class="op">/</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/period.html">days</a></span><span class="op">(</span><span class="fl">1</span><span class="op">)</span></span> <span><span class="co">#> [1] 366</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> @@ -1130,13 +1130,13 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <p>You might wonder why the time zone uses a city, when typically you think of time zones as associated with a country or region within a country. 
This is because the IANA database has to record decades worth of time zone rules. Over the course of decades, countries change names (or break apart) fairly frequently, but city names tend to stay the same. Another problem is that the name needs to reflect not only the current behavior, but also the complete history. For example, there are time zones for both “America/New_York” and “America/Detroit”. These cities both currently use Eastern Standard Time but in 1969-1972 Michigan (the state in which Detroit is located), did not follow DST, so it needs a different name. It’s worth reading the raw time zone database (available at <a href="https://www.iana.org/time-zones" class="uri">https://www.iana.org/time-zones</a>) just to read some of these stories!</p> <p>You can find out what R thinks your current time zone is with <code><a href="https://rdrr.io/r/base/timezones.html">Sys.timezone()</a></code>:</p> <div class="cell"> -<div class="sourceCode" id="cb45"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/base/timezones.html">Sys.timezone</a></span><span class="op">(</span><span class="op">)</span></span> +<div class="sourceCode" id="cb44"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/base/timezones.html">Sys.timezone</a></span><span class="op">(</span><span class="op">)</span></span> <span><span class="co">#> [1] "UTC"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>(If R doesn’t know, you’ll get an <code>NA</code>.)</p> <p>And see the complete list of all time zone names with <code><a href="https://rdrr.io/r/base/timezones.html">OlsonNames()</a></code>:</p> <div class="cell"> -<div class="sourceCode" id="cb46"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a 
href="https://rdrr.io/r/base/length.html">length</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/timezones.html">OlsonNames</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span> +<div class="sourceCode" id="cb45"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/base/length.html">length</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/timezones.html">OlsonNames</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span> <span><span class="co">#> [1] 597</span></span> <span><span class="fu"><a href="https://rdrr.io/r/utils/head.html">head</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/timezones.html">OlsonNames</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span></span> <span><span class="co">#> [1] "Africa/Abidjan" "Africa/Accra" "Africa/Addis_Ababa"</span></span> @@ -1144,7 +1144,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>In R, the time zone is an attribute of the date-time that only controls printing. 
For example, these three objects represent the same instant in time:</p> <div class="cell"> -<div class="sourceCode" id="cb47"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x1</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2024-06-01 12:00:00"</span>, tz <span class="op">=</span> <span class="st">"America/New_York"</span><span class="op">)</span></span> +<div class="sourceCode" id="cb46"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x1</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/ymd_hms.html">ymd_hms</a></span><span class="op">(</span><span class="st">"2024-06-01 12:00:00"</span>, tz <span class="op">=</span> <span class="st">"America/New_York"</span><span class="op">)</span></span> <span><span class="va">x1</span></span> <span><span class="co">#> [1] "2024-06-01 12:00:00 EDT"</span></span> <span></span> @@ -1158,14 +1158,14 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie </div> <p>You can verify that they’re the same time using subtraction:</p> <div class="cell"> -<div class="sourceCode" id="cb48"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x1</span> <span class="op">-</span> <span class="va">x2</span></span> +<div class="sourceCode" id="cb47"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x1</span> <span class="op">-</span> <span class="va">x2</span></span> <span><span class="co">#> Time difference of 0 secs</span></span> <span><span class="va">x1</span> <span class="op">-</span> <span class="va">x3</span></span> <span><span class="co">#> Time difference of 0 secs</span></span></code><button title="Copiar para a área de transferência" 
class="code-copy-button"><i class="bi"></i></button></pre></div> </div> <p>Unless otherwise specified, lubridate always uses UTC. UTC (Coordinated Universal Time) is the standard time zone used by the scientific community and is roughly equivalent to GMT (Greenwich Mean Time). It does not have DST, which makes a convenient representation for computation. Operations that combine date-times, like <code><a href="https://rdrr.io/r/base/c.html">c()</a></code>, will often drop the time zone. In that case, the date-times will display in the time zone of the first element:</p> <div class="cell"> -<div class="sourceCode" id="cb49"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x4</span> <span class="op"><-</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span><span class="op">(</span><span class="va">x1</span>, <span class="va">x2</span>, <span class="va">x3</span><span class="op">)</span></span> +<div class="sourceCode" id="cb48"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x4</span> <span class="op"><-</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span><span class="op">(</span><span class="va">x1</span>, <span class="va">x2</span>, <span class="va">x3</span><span class="op">)</span></span> <span><span class="va">x4</span></span> <span><span class="co">#> [1] "2024-06-01 12:00:00 EDT" "2024-06-01 12:00:00 EDT"</span></span> <span><span class="co">#> [3] "2024-06-01 12:00:00 EDT"</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> @@ -1175,7 +1175,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <li> <p>Keep the instant in time the same, and change how it’s displayed. 
Use this when the instant is correct, but you want a more natural display.</p> <div class="cell"> -<div class="sourceCode" id="cb50"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x4a</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/with_tz.html">with_tz</a></span><span class="op">(</span><span class="va">x4</span>, tzone <span class="op">=</span> <span class="st">"Australia/Lord_Howe"</span><span class="op">)</span></span> +<div class="sourceCode" id="cb49"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x4a</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/with_tz.html">with_tz</a></span><span class="op">(</span><span class="va">x4</span>, tzone <span class="op">=</span> <span class="st">"Australia/Lord_Howe"</span><span class="op">)</span></span> <span><span class="va">x4a</span></span> <span><span class="co">#> [1] "2024-06-02 02:30:00 +1030" "2024-06-02 02:30:00 +1030"</span></span> <span><span class="co">#> [3] "2024-06-02 02:30:00 +1030"</span></span> @@ -1188,7 +1188,7 @@ <h1 class="title"><span id="sec-dates-and-times" class="quarto-section-identifie <li> <p>Change the underlying instant in time. 
Use this when you have an instant that has been labelled with the incorrect time zone, and you need to fix it.</p> <div class="cell"> -<div class="sourceCode" id="cb51"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x4b</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/force_tz.html">force_tz</a></span><span class="op">(</span><span class="va">x4</span>, tzone <span class="op">=</span> <span class="st">"Australia/Lord_Howe"</span><span class="op">)</span></span> +<div class="sourceCode" id="cb50"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">x4b</span> <span class="op"><-</span> <span class="fu"><a href="https://lubridate.tidyverse.org/reference/force_tz.html">force_tz</a></span><span class="op">(</span><span class="va">x4</span>, tzone <span class="op">=</span> <span class="st">"Australia/Lord_Howe"</span><span class="op">)</span></span> <span><span class="va">x4b</span></span> <span><span class="co">#> [1] "2024-06-01 12:00:00 +1030" "2024-06-01 12:00:00 +1030"</span></span> <span><span class="co">#> [3] "2024-06-01 12:00:00 +1030"</span></span> diff --git a/factors.html b/factors.html index e13e7357d..da8be5eec 100644 --- a/factors.html +++ b/factors.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div 
class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -422,7 +422,7 @@ </li> <li><a href="#sec-ordered-factors" id="toc-sec-ordered-factors" class="nav-link" data-scroll-target="#sec-ordered-factors"><span class="header-section-number">16.6</span> Ordered factors</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">16.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/factors.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/factors.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/functions.html b/functions.html index 491ad9b50..8691b1fdb 100644 --- a/functions.html +++ b/functions.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); 
return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -437,7 +437,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">25.6</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/functions.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/functions.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/import.html b/import.html index b97563016..059de9ca6 100644 --- a/import.html +++ b/import.html @@ -79,7 +79,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -644,7 +644,7 @@ <h1 
class="title"><span id="sec-import" class="quarto-section-identifier">Import <div class="nav-footer-left">R para Ciência de Dados (2ª edição) foi escrito por Hadley Wickham, Mine Çetinkaya-Rundel, e Garrett Grolemund.</div> <div class="nav-footer-center"> - <div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/import.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> + <div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/import.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> <div class="nav-footer-right">Este livro foi contruído com <a href="https://quarto.org/">Quarto</a>.</div> </div> </footer> diff --git a/index.html b/index.html index e6d1a854f..aa43c172b 100644 --- a/index.html +++ b/index.html @@ -93,7 +93,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -382,7 +382,7 @@ <h2 id="toc-title">Índice</h2> <li><a href="#agradecimentos" id="toc-agradecimentos" class="nav-link" 
data-scroll-target="#agradecimentos">Agradecimentos</a></li> </ul></li> </ul> -<div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/index.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> +<div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/index.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"> diff --git a/intro.html b/intro.html index 7658a245e..8a245fb88 100644 --- a/intro.html +++ b/intro.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -417,7 +417,7 @@ <li><a href="#executando-c%C3%B3digo-em-r" id="toc-executando-código-em-r" class="nav-link" data-scroll-target="#executando-c%C3%B3digo-em-r">Executando código em R</a></li> <li><a href="#agradecimentos" id="toc-agradecimentos" class="nav-link" data-scroll-target="#agradecimentos">Agradecimentos</a></li> <li><a 
href="#considera%C3%A7%C3%B5es-finais" id="toc-considerações-finais" class="nav-link" data-scroll-target="#considera%C3%A7%C3%B5es-finais">Considerações Finais</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/intro.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/intro.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/iteration.html b/iteration.html index 6676656e4..51d63a006 100644 --- a/iteration.html +++ b/iteration.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -436,7 +436,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span 
class="header-section-number">26.5</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/iteration.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/iteration.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> @@ -1274,7 +1274,7 @@ <h1 class="title"><span id="sec-iteration" class="quarto-section-identifier"><sp <div class="cell"> <div class="sourceCode" id="cb54"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="va">con</span> <span class="op">|></span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/tbl.html">tbl</a></span><span class="op">(</span><span class="st">"gapminder"</span><span class="op">)</span></span> <span><span class="co">#> # Source: table<gapminder> [0 x 6]</span></span> -<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]</span></span> +<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]</span></span> <span><span class="co">#> # ℹ 6 variables: country <chr>, continent <chr>, lifeExp <dbl>, pop <dbl>,</span></span> <span><span class="co">#> # gdpPercap <dbl>, year <dbl></span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i 
class="bi"></i></button></pre></div> </div> @@ -1301,13 +1301,13 @@ <h1 class="title"><span id="sec-iteration" class="quarto-section-identifier"><sp <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/tbl.html">tbl</a></span><span class="op">(</span><span class="st">"gapminder"</span><span class="op">)</span> <span class="op">|></span> </span> <span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/count.html">count</a></span><span class="op">(</span><span class="va">year</span><span class="op">)</span></span> <span><span class="co">#> # Source: SQL [?? x 2]</span></span> -<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]</span></span> +<span><span class="co">#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]</span></span> <span><span class="co">#> year n</span></span> <span><span class="co">#> <dbl> <dbl></span></span> -<span><span class="co">#> 1 1967 142</span></span> -<span><span class="co">#> 2 1977 142</span></span> -<span><span class="co">#> 3 1987 142</span></span> -<span><span class="co">#> 4 2007 142</span></span> +<span><span class="co">#> 1 2007 142</span></span> +<span><span class="co">#> 2 1967 142</span></span> +<span><span class="co">#> 3 1977 142</span></span> +<span><span class="co">#> 4 1987 142</span></span> <span><span class="co">#> 5 1952 142</span></span> <span><span class="co">#> 6 1957 142</span></span> <span><span class="co">#> # ℹ more rows</span></span></code><button title="Copiar para a área de transferência" class="code-copy-button"><i class="bi"></i></button></pre></div> diff --git a/joins.html b/joins.html index cffc13b46..c93d61445 100644 --- a/joins.html +++ b/joins.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool 
px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -437,7 +437,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">19.6</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/joins.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/joins.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/layers.html b/layers.html index d4c8e40c8..9e4841f2b 100644 --- a/layers.html +++ b/layers.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" 
class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -439,7 +439,7 @@ </li> <li><a href="#the-layered-grammar-of-graphics" id="toc-the-layered-grammar-of-graphics" class="nav-link" data-scroll-target="#the-layered-grammar-of-graphics"><span class="header-section-number">9.8</span> The layered grammar of graphics</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">9.9</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/layers.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/layers.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/logicals.html b/logicals.html index d2e02c4ae..2df1bd9c0 100644 --- a/logicals.html +++ b/logicals.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi 
bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -438,7 +438,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">12.6</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/logicals.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/logicals.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/missing-values.html b/missing-values.html index 27418188a..dfae5452f 100644 --- a/missing-values.html +++ b/missing-values.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" 
class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -420,7 +420,7 @@ </li> <li><a href="#factors-and-empty-groups" id="toc-factors-and-empty-groups" class="nav-link" data-scroll-target="#factors-and-empty-groups"><span class="header-section-number">18.4</span> Factors and empty groups</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">18.5</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/missing-values.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/missing-values.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/numbers.html b/numbers.html index 2c1897a0f..633e78bd1 100644 --- a/numbers.html +++ b/numbers.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> 
+ <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -443,7 +443,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">13.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/numbers.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/numbers.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/preface-2e.html b/preface-2e.html index 6c3c29235..c7be4c855 100644 --- a/preface-2e.html +++ b/preface-2e.html @@ -94,7 +94,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source 
Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -662,7 +662,7 @@ <h1 class="title">Prefácio da segunda edição</h1> <div class="nav-footer-left">R para Ciência de Dados (2ª edição) foi escrito por Hadley Wickham, Mine Çetinkaya-Rundel, e Garrett Grolemund.</div> <div class="nav-footer-center"> - <div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/preface-2e.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> + <div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/preface-2e.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> <div class="nav-footer-right">Este livro foi contruído com <a href="https://quarto.org/">Quarto</a>.</div> </div> </footer> diff --git a/program.html b/program.html index 77efe1e7d..3760c678a 100644 --- a/program.html +++ b/program.html @@ -79,7 +79,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" 
onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -643,7 +643,7 @@ <h1 class="title"><span id="sec-program-intro" class="quarto-section-identifier" <div class="nav-footer-left">R para Ciência de Dados (2ª edição) foi escrito por Hadley Wickham, Mine Çetinkaya-Rundel, e Garrett Grolemund.</div> <div class="nav-footer-center"> - <div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/program.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> + <div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/program.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> <div class="nav-footer-right">Este livro foi contruído com <a href="https://quarto.org/">Quarto</a>.</div> </div> </footer> diff --git a/quarto-formats.html b/quarto-formats.html index e6848cbed..dae0c3b29 100644 --- a/quarto-formats.html +++ b/quarto-formats.html @@ -115,7 +115,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo 
de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -411,7 +411,7 @@ <li><a href="#websites-and-books" id="toc-websites-and-books" class="nav-link" data-scroll-target="#websites-and-books"><span class="header-section-number">29.6</span> Websites and books</a></li> <li><a href="#other-formats" id="toc-other-formats" class="nav-link" data-scroll-target="#other-formats"><span class="header-section-number">29.7</span> Other formats</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">29.8</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/quarto-formats.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/quarto-formats.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/quarto.html b/quarto.html index 2a01bf24f..d3bb96edb 100644 --- a/quarto.html +++ b/quarto.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" 
class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -460,7 +460,7 @@ </li> <li><a href="#workflow" id="toc-workflow" class="nav-link" data-scroll-target="#workflow"><span class="header-section-number">28.11</span> Workflow</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">28.12</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/quarto.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/quarto.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/rectangling.html b/rectangling.html index 871bb6bee..813556b51 100644 --- a/rectangling.html +++ b/rectangling.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" 
title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -437,7 +437,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">23.6</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/rectangling.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/rectangling.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/regexps.html b/regexps.html index f389bb0eb..60f069c12 100644 --- a/regexps.html +++ b/regexps.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" 
class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -448,7 +448,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">15.8</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/regexps.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/regexps.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/robots.txt b/robots.txt index b1fc7ef84..0ac8fbe39 100644 --- a/robots.txt +++ b/robots.txt @@ -1 +1 @@ -Sitemap: https://r4ds.hadley.nz/sitemap.xml +Sitemap: https://cienciadedatos.github.io/pt-r4ds/sitemap.xml diff --git a/search.json b/search.json index 73a5d3ed7..eb09f5f12 100644 --- a/search.json +++ b/search.json @@ -396,7 +396,7 @@ "href": "data-import.html#sec-col-types", "title": "7 Data import", "section": "\n7.3 Controlling column types", - "text": "7.3 Controlling column types\nA CSV file doesn’t contain any information about the type of each variable (i.e. whether it’s a logical, number, string, etc.), so readr will try to guess the type. 
This section describes how the guessing process works, how to resolve some common problems that cause it to fail, and, if needed, how to supply the column types yourself. Finally, we’ll mention a few general strategies that are useful if readr is failing catastrophically and you need to get more insight into the structure of your file.\n\n7.3.1 Guessing types\nreadr uses a heuristic to figure out the column types. For each column, it pulls the values of 1,0002 rows spaced evenly from the first row to the last, ignoring missing values. It then works through the following questions:\n\nDoes it contain only F, T, FALSE, or TRUE (ignoring case)? If so, it’s a logical.\nDoes it contain only numbers (e.g., 1, -4.5, 5e6, Inf)? If so, it’s a number.\nDoes it match the ISO8601 standard? If so, it’s a date or date-time. (We’ll return to date-times in more detail in Seção 17.2).\nOtherwise, it must be a string.\n\nYou can see that behavior in action in this simple example:\n\nread_csv(\"\n logical,numeric,date,string\n TRUE,1,2021-01-15,abc\n false,4.5,2021-02-15,def\n T,Inf,2021-02-16,ghi\n\")\n#> # A tibble: 3 × 4\n#> logical numeric date string\n#> <lgl> <dbl> <date> <chr> \n#> 1 TRUE 1 2021-01-15 abc \n#> 2 FALSE 4.5 2021-02-15 def \n#> 3 TRUE Inf 2021-02-16 ghi\n\nThis heuristic works well if you have a clean dataset, but in real life, you’ll encounter a selection of weird and beautiful failures.\n\n7.3.2 Missing values, column types, and problems\nThe most common way column detection fails is that a column contains unexpected values, and you get a character column instead of a more specific type. One of the most common causes for this is a missing value, recorded using something other than the NA that readr expects.\nTake this simple 1 column CSV file as an example:\n\nsimple_csv <- \"\n x\n 10\n .\n 20\n 30\"\n\nIf we read it without any additional arguments, x becomes a character column:\n\nread_csv(simple_csv)\n#> # A tibble: 4 × 1\n#> x \n#> <chr>\n#> 1 10 \n#> 2 . 
\n#> 3 20 \n#> 4 30\n\nIn this very small case, you can easily see the missing value .. But what happens if you have thousands of rows with only a few missing values represented by .s sprinkled among them? One approach is to tell readr that x is a numeric column, and then see where it fails. You can do that with the col_types argument, which takes a named list where the names match the column names in the CSV file:\n\ndf <- read_csv(\n simple_csv, \n col_types = list(x = col_double())\n)\n#> Warning: One or more parsing issues, call `problems()` on your data frame for\n#> details, e.g.:\n#> dat <- vroom(...)\n#> problems(dat)\n\nNow read_csv() reports that there was a problem, and tells us we can find out more with problems():\n\nproblems(df)\n#> # A tibble: 1 × 5\n#> row col expected actual file \n#> <int> <int> <chr> <chr> <chr> \n#> 1 3 1 a double . /tmp/RtmprQ7oUD/file1e2549f0663d\n\nThis tells us that there was a problem in row 3, col 1 where readr expected a double but got a .. That suggests this dataset uses . for missing values. So then we set na = \".\", the automatic guessing succeeds, giving us the numeric column that we want:\n\nread_csv(simple_csv, na = \".\")\n#> # A tibble: 4 × 1\n#> x\n#> <dbl>\n#> 1 10\n#> 2 NA\n#> 3 20\n#> 4 30\n\n\n7.3.3 Column types\nreadr provides a total of nine column types for you to use:\n\n\ncol_logical() and col_double() read logicals and real numbers. They’re relatively rarely needed (except as above), since readr will usually guess them for you.\n\ncol_integer() reads integers. We seldom distinguish integers and doubles in this book because they’re functionally equivalent, but reading integers explicitly can occasionally be useful because they occupy half the memory of doubles.\n\ncol_character() reads strings. This can be useful to specify explicitly when you have a column that is a numeric identifier, i.e., long series of digits that identifies an object but doesn’t make sense to apply mathematical operations to. 
Examples include phone numbers, social security numbers, credit card numbers, etc.\n\ncol_factor(), col_date(), and col_datetime() create factors, dates, and date-times respectively; you’ll learn more about those when we get to those data types in Capítulo 16 and Capítulo 17.\n\ncol_number() is a permissive numeric parser that will ignore non-numeric components, and is particularly useful for currencies. You’ll learn more about it in Capítulo 13.\n\ncol_skip() skips a column so it’s not included in the result, which can be useful for speeding up reading the data if you have a large CSV file and you only want to use some of the columns.\n\nIt’s also possible to override the default column by switching from list() to cols() and specifying .default:\n\nanother_csv <- \"\nx,y,z\n1,2,3\"\n\nread_csv(\n another_csv, \n col_types = cols(.default = col_character())\n)\n#> # A tibble: 1 × 3\n#> x y z \n#> <chr> <chr> <chr>\n#> 1 1 2 3\n\nAnother useful helper is cols_only() which will read in only the columns you specify:\n\nread_csv(\n another_csv,\n col_types = cols_only(x = col_character())\n)\n#> # A tibble: 1 × 1\n#> x \n#> <chr>\n#> 1 1" + "text": "7.3 Controlling column types\nA CSV file doesn’t contain any information about the type of each variable (i.e. whether it’s a logical, number, string, etc.), so readr will try to guess the type. This section describes how the guessing process works, how to resolve some common problems that cause it to fail, and, if needed, how to supply the column types yourself. Finally, we’ll mention a few general strategies that are useful if readr is failing catastrophically and you need to get more insight into the structure of your file.\n\n7.3.1 Guessing types\nreadr uses a heuristic to figure out the column types. For each column, it pulls the values of 1,0002 rows spaced evenly from the first row to the last, ignoring missing values. 
It then works through the following questions:\n\nDoes it contain only F, T, FALSE, or TRUE (ignoring case)? If so, it’s a logical.\nDoes it contain only numbers (e.g., 1, -4.5, 5e6, Inf)? If so, it’s a number.\nDoes it match the ISO8601 standard? If so, it’s a date or date-time. (We’ll return to date-times in more detail in Seção 17.2).\nOtherwise, it must be a string.\n\nYou can see that behavior in action in this simple example:\n\nread_csv(\"\n logical,numeric,date,string\n TRUE,1,2021-01-15,abc\n false,4.5,2021-02-15,def\n T,Inf,2021-02-16,ghi\n\")\n#> # A tibble: 3 × 4\n#> logical numeric date string\n#> <lgl> <dbl> <date> <chr> \n#> 1 TRUE 1 2021-01-15 abc \n#> 2 FALSE 4.5 2021-02-15 def \n#> 3 TRUE Inf 2021-02-16 ghi\n\nThis heuristic works well if you have a clean dataset, but in real life, you’ll encounter a selection of weird and beautiful failures.\n\n7.3.2 Missing values, column types, and problems\nThe most common way column detection fails is that a column contains unexpected values, and you get a character column instead of a more specific type. One of the most common causes for this is a missing value, recorded using something other than the NA that readr expects.\nTake this simple 1 column CSV file as an example:\n\nsimple_csv <- \"\n x\n 10\n .\n 20\n 30\"\n\nIf we read it without any additional arguments, x becomes a character column:\n\nread_csv(simple_csv)\n#> # A tibble: 4 × 1\n#> x \n#> <chr>\n#> 1 10 \n#> 2 . \n#> 3 20 \n#> 4 30\n\nIn this very small case, you can easily see the missing value .. But what happens if you have thousands of rows with only a few missing values represented by .s sprinkled among them? One approach is to tell readr that x is a numeric column, and then see where it fails. 
You can do that with the col_types argument, which takes a named list where the names match the column names in the CSV file:\n\ndf <- read_csv(\n simple_csv, \n col_types = list(x = col_double())\n)\n#> Warning: One or more parsing issues, call `problems()` on your data frame for\n#> details, e.g.:\n#> dat <- vroom(...)\n#> problems(dat)\n\nNow read_csv() reports that there was a problem, and tells us we can find out more with problems():\n\nproblems(df)\n#> # A tibble: 1 × 5\n#> row col expected actual file \n#> <int> <int> <chr> <chr> <chr> \n#> 1 3 1 a double . /tmp/RtmpmQKHAo/file1e8c59e6768d\n\nThis tells us that there was a problem in row 3, col 1 where readr expected a double but got a .. That suggests this dataset uses . for missing values. So then we set na = \".\", the automatic guessing succeeds, giving us the numeric column that we want:\n\nread_csv(simple_csv, na = \".\")\n#> # A tibble: 4 × 1\n#> x\n#> <dbl>\n#> 1 10\n#> 2 NA\n#> 3 20\n#> 4 30\n\n\n7.3.3 Column types\nreadr provides a total of nine column types for you to use:\n\n\ncol_logical() and col_double() read logicals and real numbers. They’re relatively rarely needed (except as above), since readr will usually guess them for you.\n\ncol_integer() reads integers. We seldom distinguish integers and doubles in this book because they’re functionally equivalent, but reading integers explicitly can occasionally be useful because they occupy half the memory of doubles.\n\ncol_character() reads strings. This can be useful to specify explicitly when you have a column that is a numeric identifier, i.e., long series of digits that identifies an object but doesn’t make sense to apply mathematical operations to. 
Examples include phone numbers, social security numbers, credit card numbers, etc.\n\ncol_factor(), col_date(), and col_datetime() create factors, dates, and date-times respectively; you’ll learn more about those when we get to those data types in Capítulo 16 and Capítulo 17.\n\ncol_number() is a permissive numeric parser that will ignore non-numeric components, and is particularly useful for currencies. You’ll learn more about it in Capítulo 13.\n\ncol_skip() skips a column so it’s not included in the result, which can be useful for speeding up reading the data if you have a large CSV file and you only want to use some of the columns.\n\nIt’s also possible to override the default column by switching from list() to cols() and specifying .default:\n\nanother_csv <- \"\nx,y,z\n1,2,3\"\n\nread_csv(\n another_csv, \n col_types = cols(.default = col_character())\n)\n#> # A tibble: 1 × 3\n#> x y z \n#> <chr> <chr> <chr>\n#> 1 1 2 3\n\nAnother useful helper is cols_only() which will read in only the columns you specify:\n\nread_csv(\n another_csv,\n col_types = cols_only(x = col_character())\n)\n#> # A tibble: 1 × 1\n#> x \n#> <chr>\n#> 1 1" }, { "objectID": "data-import.html#sec-readr-directory", @@ -942,7 +942,7 @@ "href": "datetimes.html#sec-creating-datetimes", "title": "17 Dates and times", "section": "\n17.2 Creating date/times", - "text": "17.2 Creating date/times\nThere are three types of date/time data that refer to an instant in time:\n\nA date. Tibbles print this as <date>.\nA time within a day. Tibbles print this as <time>.\nA date-time is a date plus a time: it uniquely identifies an instant in time (typically to the nearest second). Tibbles print this as <dttm>. Base R calls these POSIXct, but doesn’t exactly trip off the tongue.\n\nIn this chapter we are going to focus on dates and date-times as R doesn’t have a native class for storing times. 
If you need one, you can use the hms package.\nYou should always use the simplest possible data type that works for your needs. That means if you can use a date instead of a date-time, you should. Date-times are substantially more complicated because of the need to handle time zones, which we’ll come back to at the end of the chapter.\nTo get the current date or date-time you can use today() or now():\n\ntoday()\n#> [1] \"2023-11-17\"\nnow()\n#> [1] \"2023-11-17 21:36:55 UTC\"\n\nOtherwise, the following sections describe the four ways you’re likely to create a date/time:\n\nWhile reading a file with readr.\nFrom a string.\nFrom individual date-time components.\nFrom an existing date/time object.\n\n\n17.2.1 During import\nIf your CSV contains an ISO8601 date or date-time, you don’t need to do anything; readr will automatically recognize it:\n\ncsv <- \"\n date,datetime\n 2022-01-02,2022-01-02 05:12\n\"\nread_csv(csv)\n#> # A tibble: 1 × 2\n#> date datetime \n#> <date> <dttm> \n#> 1 2022-01-02 2022-01-02 05:12:00\n\nIf you haven’t heard of ISO8601 before, it’s an international standard2 for writing dates where the components of a date are organized from biggest to smallest separated by -. For example, in ISO8601 May 3 2022 is 2022-05-03. ISO8601 dates can also include times, where hour, minute, and second are separated by :, and the date and time components are separated by either a T or a space. For example, you could write 4:26pm on May 3 2022 as either 2022-05-03 16:26 or 2022-05-03T16:26.\nFor other date-time formats, you’ll need to use col_types plus col_date() or col_datetime() along with a date-time format. The date-time format used by readr is a standard used across many programming languages, describing a date component with a % followed by a single character. For example, %Y-%m-%d specifies a date that’s a year, -, month (as number) -, day. 
Table Tabela 17.1 lists all the options.\n\n\nTabela 17.1: All date formats understood by readr\n\nType\nCode\nMeaning\nExample\n\n\n\nYear\n%Y\n4 digit year\n2021\n\n\n\n%y\n2 digit year\n21\n\n\nMonth\n%m\nNumber\n2\n\n\n\n%b\nAbbreviated name\nFeb\n\n\n\n%B\nFull name\nFebruary\n\n\nDay\n%d\nOne or two digits\n2\n\n\n\n%e\nTwo digits\n02\n\n\nTime\n%H\n24-hour hour\n13\n\n\n\n%I\n12-hour hour\n1\n\n\n\n%p\nAM/PM\npm\n\n\n\n%M\nMinutes\n35\n\n\n\n%S\nSeconds\n45\n\n\n\n%OS\nSeconds with decimal component\n45.35\n\n\n\n%Z\nTime zone name\nAmerica/Chicago\n\n\n\n%z\nOffset from UTC\n+0800\n\n\nOther\n%.\nSkip one non-digit\n:\n\n\n\n%*\nSkip any number of non-digits\n\n\n\n\n\nAnd this code shows a few options applied to a very ambiguous date:\n\ncsv <- \"\n date\n 01/02/15\n\"\n\nread_csv(csv, col_types = cols(date = col_date(\"%m/%d/%y\")))\n#> # A tibble: 1 × 1\n#> date \n#> <date> \n#> 1 2015-01-02\n\nread_csv(csv, col_types = cols(date = col_date(\"%d/%m/%y\")))\n#> # A tibble: 1 × 1\n#> date \n#> <date> \n#> 1 2015-02-01\n\nread_csv(csv, col_types = cols(date = col_date(\"%y/%m/%d\")))\n#> # A tibble: 1 × 1\n#> date \n#> <date> \n#> 1 2001-02-15\n\nNote that no matter how you specify the date format, it’s always displayed the same way once you get it into R.\nIf you’re using %b or %B and working with non-English dates, you’ll also need to provide a locale(). See the list of built-in languages in date_names_langs(), or create your own with date_names(),\n\n17.2.2 From strings\nThe date-time specification language is powerful, but requires careful analysis of the date format. An alternative approach is to use lubridate’s helpers which attempt to automatically determine the format once you specify the order of the component. To use them, identify the order in which year, month, and day appear in your dates, then arrange “y”, “m”, and “d” in the same order. That gives you the name of the lubridate function that will parse your date. 
For example:\n\nymd(\"2017-01-31\")\n#> [1] \"2017-01-31\"\nmdy(\"January 31st, 2017\")\n#> [1] \"2017-01-31\"\ndmy(\"31-Jan-2017\")\n#> [1] \"2017-01-31\"\n\nymd() and friends create dates. To create a date-time, add an underscore and one or more of “h”, “m”, and “s” to the name of the parsing function:\n\nymd_hms(\"2017-01-31 20:11:59\")\n#> [1] \"2017-01-31 20:11:59 UTC\"\nmdy_hm(\"01/31/2017 08:01\")\n#> [1] \"2017-01-31 08:01:00 UTC\"\n\nYou can also force the creation of a date-time from a date by supplying a timezone:\n\nymd(\"2017-01-31\", tz = \"UTC\")\n#> [1] \"2017-01-31 UTC\"\n\nHere I use the UTC3 timezone which you might also know as GMT, or Greenwich Mean Time, the time at 0° longitude4 . It doesn’t use daylight saving time, making it a bit easier to compute with .\n\n17.2.3 From individual components\nInstead of a single string, sometimes you’ll have the individual components of the date-time spread across multiple columns. This is what we have in the flights data:\n\nflights |> \n select(year, month, day, hour, minute)\n#> # A tibble: 336,776 × 5\n#> year month day hour minute\n#> <int> <int> <int> <dbl> <dbl>\n#> 1 2013 1 1 5 15\n#> 2 2013 1 1 5 29\n#> 3 2013 1 1 5 40\n#> 4 2013 1 1 5 45\n#> 5 2013 1 1 6 0\n#> 6 2013 1 1 5 58\n#> # ℹ 336,770 more rows\n\nTo create a date/time from this sort of input, use make_date() for dates, or make_datetime() for date-times:\n\nflights |> \n select(year, month, day, hour, minute) |> \n mutate(departure = make_datetime(year, month, day, hour, minute))\n#> # A tibble: 336,776 × 6\n#> year month day hour minute departure \n#> <int> <int> <int> <dbl> <dbl> <dttm> \n#> 1 2013 1 1 5 15 2013-01-01 05:15:00\n#> 2 2013 1 1 5 29 2013-01-01 05:29:00\n#> 3 2013 1 1 5 40 2013-01-01 05:40:00\n#> 4 2013 1 1 5 45 2013-01-01 05:45:00\n#> 5 2013 1 1 6 0 2013-01-01 06:00:00\n#> 6 2013 1 1 5 58 2013-01-01 05:58:00\n#> # ℹ 336,770 more rows\n\nLet’s do the same thing for each of the four time columns in flights. 
The times are represented in a slightly odd format, so we use modulus arithmetic to pull out the hour and minute components. Once we’ve created the date-time variables, we focus in on the variables we’ll explore in the rest of the chapter.\n\nmake_datetime_100 <- function(year, month, day, time) {\n make_datetime(year, month, day, time %/% 100, time %% 100)\n}\n\nflights_dt <- flights |> \n filter(!is.na(dep_time), !is.na(arr_time)) |> \n mutate(\n dep_time = make_datetime_100(year, month, day, dep_time),\n arr_time = make_datetime_100(year, month, day, arr_time),\n sched_dep_time = make_datetime_100(year, month, day, sched_dep_time),\n sched_arr_time = make_datetime_100(year, month, day, sched_arr_time)\n ) |> \n select(origin, dest, ends_with(\"delay\"), ends_with(\"time\"))\n\nflights_dt\n#> # A tibble: 328,063 × 9\n#> origin dest dep_delay arr_delay dep_time sched_dep_time \n#> <chr> <chr> <dbl> <dbl> <dttm> <dttm> \n#> 1 EWR IAH 2 11 2013-01-01 05:17:00 2013-01-01 05:15:00\n#> 2 LGA IAH 4 20 2013-01-01 05:33:00 2013-01-01 05:29:00\n#> 3 JFK MIA 2 33 2013-01-01 05:42:00 2013-01-01 05:40:00\n#> 4 JFK BQN -1 -18 2013-01-01 05:44:00 2013-01-01 05:45:00\n#> 5 LGA ATL -6 -25 2013-01-01 05:54:00 2013-01-01 06:00:00\n#> 6 EWR ORD -4 12 2013-01-01 05:54:00 2013-01-01 05:58:00\n#> # ℹ 328,057 more rows\n#> # ℹ 3 more variables: arr_time <dttm>, sched_arr_time <dttm>, …\n\nWith this data, we can visualize the distribution of departure times across the year:\n\nflights_dt |> \n ggplot(aes(x = dep_time)) + \n geom_freqpoly(binwidth = 86400) # 86400 seconds = 1 day\n\n\n\n\nOr within a single day:\n\nflights_dt |> \n filter(dep_time < ymd(20130102)) |> \n ggplot(aes(x = dep_time)) + \n geom_freqpoly(binwidth = 600) # 600 s = 10 minutes\n\n\n\n\nNote that when you use date-times in a numeric context (like in a histogram), 1 means 1 second, so a binwidth of 86400 means one day. 
For dates, 1 means 1 day.\n\n17.2.4 From other types\nYou may want to switch between a date-time and a date. That’s the job of as_datetime() and as_date():\n\nas_datetime(today())\n#> [1] \"2023-11-17 UTC\"\nas_date(now())\n#> [1] \"2023-11-17\"\n\nSometimes you’ll get date/times as numeric offsets from the “Unix Epoch”, 1970-01-01. If the offset is in seconds, use as_datetime(); if it’s in days, use as_date().\n\nas_datetime(60 * 60 * 10)\n#> [1] \"1970-01-01 10:00:00 UTC\"\nas_date(365 * 10 + 2)\n#> [1] \"1980-01-01\"\n\n\n17.2.5 Exercises\n\n\nWhat happens if you parse a string that contains invalid dates?\n\nymd(c(\"2010-10-10\", \"bananas\"))\n\n\nWhat does the tzone argument to today() do? Why is it important?\n\nFor each of the following date-times, show how you’d parse it using a readr column specification and a lubridate function.\n\nd1 <- \"January 1, 2010\"\nd2 <- \"2015-Mar-07\"\nd3 <- \"06-Jun-2017\"\nd4 <- c(\"August 19 (2015)\", \"July 1 (2015)\")\nd5 <- \"12/30/14\" # Dec 30, 2014\nt1 <- \"1705\"\nt2 <- \"11:15:10.12 PM\"" + "text": "17.2 Creating date/times\nThere are three types of date/time data that refer to an instant in time:\n\nA date. Tibbles print this as <date>.\nA time within a day. Tibbles print this as <time>.\nA date-time is a date plus a time: it uniquely identifies an instant in time (typically to the nearest second). Tibbles print this as <dttm>. Base R calls these POSIXct, but doesn’t exactly trip off the tongue.\n\nIn this chapter we are going to focus on dates and date-times as R doesn’t have a native class for storing times. If you need one, you can use the hms package.\nYou should always use the simplest possible data type that works for your needs. That means if you can use a date instead of a date-time, you should. 
Date-times are substantially more complicated because of the need to handle time zones, which we’ll come back to at the end of the chapter.\nTo get the current date or date-time you can use today() or now():\n\ntoday()\n#> [1] \"2023-11-17\"\nnow()\n#> [1] \"2023-11-17 22:04:44 UTC\"\n\nOtherwise, the following sections describe the four ways you’re likely to create a date/time:\n\nWhile reading a file with readr.\nFrom a string.\nFrom individual date-time components.\nFrom an existing date/time object.\n\n\n17.2.1 During import\nIf your CSV contains an ISO8601 date or date-time, you don’t need to do anything; readr will automatically recognize it:\n\ncsv <- \"\n date,datetime\n 2022-01-02,2022-01-02 05:12\n\"\nread_csv(csv)\n#> # A tibble: 1 × 2\n#> date datetime \n#> <date> <dttm> \n#> 1 2022-01-02 2022-01-02 05:12:00\n\nIf you haven’t heard of ISO8601 before, it’s an international standard2 for writing dates where the components of a date are organized from biggest to smallest separated by -. For example, in ISO8601 May 3 2022 is 2022-05-03. ISO8601 dates can also include times, where hour, minute, and second are separated by :, and the date and time components are separated by either a T or a space. For example, you could write 4:26pm on May 3 2022 as either 2022-05-03 16:26 or 2022-05-03T16:26.\nFor other date-time formats, you’ll need to use col_types plus col_date() or col_datetime() along with a date-time format. The date-time format used by readr is a standard used across many programming languages, describing a date component with a % followed by a single character. For example, %Y-%m-%d specifies a date that’s a year, -, month (as number) -, day. 
Table Tabela 17.1 lists all the options.\n\n\nTabela 17.1: All date formats understood by readr\n\nType\nCode\nMeaning\nExample\n\n\n\nYear\n%Y\n4 digit year\n2021\n\n\n\n%y\n2 digit year\n21\n\n\nMonth\n%m\nNumber\n2\n\n\n\n%b\nAbbreviated name\nFeb\n\n\n\n%B\nFull name\nFebruary\n\n\nDay\n%d\nOne or two digits\n2\n\n\n\n%e\nTwo digits\n02\n\n\nTime\n%H\n24-hour hour\n13\n\n\n\n%I\n12-hour hour\n1\n\n\n\n%p\nAM/PM\npm\n\n\n\n%M\nMinutes\n35\n\n\n\n%S\nSeconds\n45\n\n\n\n%OS\nSeconds with decimal component\n45.35\n\n\n\n%Z\nTime zone name\nAmerica/Chicago\n\n\n\n%z\nOffset from UTC\n+0800\n\n\nOther\n%.\nSkip one non-digit\n:\n\n\n\n%*\nSkip any number of non-digits\n\n\n\n\n\nAnd this code shows a few options applied to a very ambiguous date:\n\ncsv <- \"\n date\n 01/02/15\n\"\n\nread_csv(csv, col_types = cols(date = col_date(\"%m/%d/%y\")))\n#> # A tibble: 1 × 1\n#> date \n#> <date> \n#> 1 2015-01-02\n\nread_csv(csv, col_types = cols(date = col_date(\"%d/%m/%y\")))\n#> # A tibble: 1 × 1\n#> date \n#> <date> \n#> 1 2015-02-01\n\nread_csv(csv, col_types = cols(date = col_date(\"%y/%m/%d\")))\n#> # A tibble: 1 × 1\n#> date \n#> <date> \n#> 1 2001-02-15\n\nNote that no matter how you specify the date format, it’s always displayed the same way once you get it into R.\nIf you’re using %b or %B and working with non-English dates, you’ll also need to provide a locale(). See the list of built-in languages in date_names_langs(), or create your own with date_names(),\n\n17.2.2 From strings\nThe date-time specification language is powerful, but requires careful analysis of the date format. An alternative approach is to use lubridate’s helpers which attempt to automatically determine the format once you specify the order of the component. To use them, identify the order in which year, month, and day appear in your dates, then arrange “y”, “m”, and “d” in the same order. That gives you the name of the lubridate function that will parse your date. 
For example:\n\nymd(\"2017-01-31\")\n#> [1] \"2017-01-31\"\nmdy(\"January 31st, 2017\")\n#> [1] \"2017-01-31\"\ndmy(\"31-Jan-2017\")\n#> [1] \"2017-01-31\"\n\nymd() and friends create dates. To create a date-time, add an underscore and one or more of “h”, “m”, and “s” to the name of the parsing function:\n\nymd_hms(\"2017-01-31 20:11:59\")\n#> [1] \"2017-01-31 20:11:59 UTC\"\nmdy_hm(\"01/31/2017 08:01\")\n#> [1] \"2017-01-31 08:01:00 UTC\"\n\nYou can also force the creation of a date-time from a date by supplying a timezone:\n\nymd(\"2017-01-31\", tz = \"UTC\")\n#> [1] \"2017-01-31 UTC\"\n\nHere I use the UTC timezone which you might also know as GMT, or Greenwich Mean Time, the time at 0° longitude. It doesn’t use daylight saving time, making it a bit easier to compute with.\n\n17.2.3 From individual components\nInstead of a single string, sometimes you’ll have the individual components of the date-time spread across multiple columns. This is what we have in the flights data:\n\nflights |> \n select(year, month, day, hour, minute)\n#> # A tibble: 336,776 × 5\n#> year month day hour minute\n#> <int> <int> <int> <dbl> <dbl>\n#> 1 2013 1 1 5 15\n#> 2 2013 1 1 5 29\n#> 3 2013 1 1 5 40\n#> 4 2013 1 1 5 45\n#> 5 2013 1 1 6 0\n#> 6 2013 1 1 5 58\n#> # ℹ 336,770 more rows\n\nTo create a date/time from this sort of input, use make_date() for dates, or make_datetime() for date-times:\n\nflights |> \n select(year, month, day, hour, minute) |> \n mutate(departure = make_datetime(year, month, day, hour, minute))\n#> # A tibble: 336,776 × 6\n#> year month day hour minute departure \n#> <int> <int> <int> <dbl> <dbl> <dttm> \n#> 1 2013 1 1 5 15 2013-01-01 05:15:00\n#> 2 2013 1 1 5 29 2013-01-01 05:29:00\n#> 3 2013 1 1 5 40 2013-01-01 05:40:00\n#> 4 2013 1 1 5 45 2013-01-01 05:45:00\n#> 5 2013 1 1 6 0 2013-01-01 06:00:00\n#> 6 2013 1 1 5 58 2013-01-01 05:58:00\n#> # ℹ 336,770 more rows\n\nLet’s do the same thing for each of the four time columns in flights. 
The times are represented in a slightly odd format, so we use modulus arithmetic to pull out the hour and minute components. Once we’ve created the date-time variables, we focus in on the variables we’ll explore in the rest of the chapter.\n\nmake_datetime_100 <- function(year, month, day, time) {\n make_datetime(year, month, day, time %/% 100, time %% 100)\n}\n\nflights_dt <- flights |> \n filter(!is.na(dep_time), !is.na(arr_time)) |> \n mutate(\n dep_time = make_datetime_100(year, month, day, dep_time),\n arr_time = make_datetime_100(year, month, day, arr_time),\n sched_dep_time = make_datetime_100(year, month, day, sched_dep_time),\n sched_arr_time = make_datetime_100(year, month, day, sched_arr_time)\n ) |> \n select(origin, dest, ends_with(\"delay\"), ends_with(\"time\"))\n\nflights_dt\n#> # A tibble: 328,063 × 9\n#> origin dest dep_delay arr_delay dep_time sched_dep_time \n#> <chr> <chr> <dbl> <dbl> <dttm> <dttm> \n#> 1 EWR IAH 2 11 2013-01-01 05:17:00 2013-01-01 05:15:00\n#> 2 LGA IAH 4 20 2013-01-01 05:33:00 2013-01-01 05:29:00\n#> 3 JFK MIA 2 33 2013-01-01 05:42:00 2013-01-01 05:40:00\n#> 4 JFK BQN -1 -18 2013-01-01 05:44:00 2013-01-01 05:45:00\n#> 5 LGA ATL -6 -25 2013-01-01 05:54:00 2013-01-01 06:00:00\n#> 6 EWR ORD -4 12 2013-01-01 05:54:00 2013-01-01 05:58:00\n#> # ℹ 328,057 more rows\n#> # ℹ 3 more variables: arr_time <dttm>, sched_arr_time <dttm>, …\n\nWith this data, we can visualize the distribution of departure times across the year:\n\nflights_dt |> \n ggplot(aes(x = dep_time)) + \n geom_freqpoly(binwidth = 86400) # 86400 seconds = 1 day\n\n\n\n\nOr within a single day:\n\nflights_dt |> \n filter(dep_time < ymd(20130102)) |> \n ggplot(aes(x = dep_time)) + \n geom_freqpoly(binwidth = 600) # 600 s = 10 minutes\n\n\n\n\nNote that when you use date-times in a numeric context (like in a histogram), 1 means 1 second, so a binwidth of 86400 means one day. 
For dates, 1 means 1 day.\n\n17.2.4 From other types\nYou may want to switch between a date-time and a date. That’s the job of as_datetime() and as_date():\n\nas_datetime(today())\n#> [1] \"2023-11-17 UTC\"\nas_date(now())\n#> [1] \"2023-11-17\"\n\nSometimes you’ll get date/times as numeric offsets from the “Unix Epoch”, 1970-01-01. If the offset is in seconds, use as_datetime(); if it’s in days, use as_date().\n\nas_datetime(60 * 60 * 10)\n#> [1] \"1970-01-01 10:00:00 UTC\"\nas_date(365 * 10 + 2)\n#> [1] \"1980-01-01\"\n\n\n17.2.5 Exercises\n\n\nWhat happens if you parse a string that contains invalid dates?\n\nymd(c(\"2010-10-10\", \"bananas\"))\n\n\nWhat does the tzone argument to today() do? Why is it important?\n\nFor each of the following date-times, show how you’d parse it using a readr column specification and a lubridate function.\n\nd1 <- \"January 1, 2010\"\nd2 <- \"2015-Mar-07\"\nd3 <- \"06-Jun-2017\"\nd4 <- c(\"August 19 (2015)\", \"July 1 (2015)\")\nd5 <- \"12/30/14\" # Dec 30, 2014\nt1 <- \"1705\"\nt2 <- \"11:15:10.12 PM\"" }, { "objectID": "datetimes.html#date-time-components", @@ -1131,14 +1131,14 @@ "href": "databases.html#dbplyr-basics", "title": "21 Databases", "section": "\n21.4 dbplyr basics", - "text": "21.4 dbplyr basics\nNow that we’ve connected to a database and loaded up some data, we can start to learn about dbplyr. dbplyr is a dplyr backend, which means that you keep writing dplyr code but the backend executes it differently. In this, dbplyr translates to SQL; other backends include dtplyr which translates to data.table, and multidplyr which executes your code on multiple cores.\nTo use dbplyr, you must first use tbl() to create an object that represents a database table:\n\ndiamonds_db <- tbl(con, \"diamonds\")\ndiamonds_db\n#> # Source: table<diamonds> [?? 
x 10]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]\n#> carat cut color clarity depth table price x y z\n#> <dbl> <fct> <fct> <fct> <dbl> <dbl> <int> <dbl> <dbl> <dbl>\n#> 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43\n#> 2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31\n#> 3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31\n#> 4 0.29 Premium I VS2 62.4 58 334 4.2 4.23 2.63\n#> 5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75\n#> 6 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48\n#> # ℹ more rows\n\n\n\n\n\n\n\nThere are two other common ways to interact with a database. First, many corporate databases are very large so you need some hierarchy to keep all the tables organized. In that case you might need to supply a schema, or a catalog and a schema, in order to pick the table you’re interested in:\n\ndiamonds_db <- tbl(con, in_schema(\"sales\", \"diamonds\"))\ndiamonds_db <- tbl(con, in_catalog(\"north_america\", \"sales\", \"diamonds\"))\n\nOther times you might want to use your own SQL query as a starting point:\n\ndiamonds_db <- tbl(con, sql(\"SELECT * FROM diamonds\"))\n\n\n\n\nThis object is lazy; when you use dplyr verbs on it, dplyr doesn’t do any work: it just records the sequence of operations that you want to perform and only performs them when needed. For example, take the following pipeline:\n\nbig_diamonds_db <- diamonds_db |> \n filter(price > 15000) |> \n select(carat:clarity, price)\n\nbig_diamonds_db\n#> # Source: SQL [?? 
x 5]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]\n#> carat cut color clarity price\n#> <dbl> <fct> <fct> <fct> <int>\n#> 1 1.54 Premium E VS2 15002\n#> 2 1.19 Ideal F VVS1 15005\n#> 3 2.1 Premium I SI1 15007\n#> 4 1.69 Ideal D SI1 15011\n#> 5 1.5 Very Good G VVS2 15013\n#> 6 1.73 Very Good G VS1 15014\n#> # ℹ more rows\n\nYou can tell this object represents a database query because it prints the DBMS name at the top, and while it tells you the number of columns, it typically doesn’t know the number of rows. This is because finding the total number of rows usually requires executing the complete query, something we’re trying to avoid.\nYou can see the SQL code generated by the dplyr function show_query(). If you know dplyr, this is a great way to learn SQL! Write some dplyr code, get dbplyr to translate it to SQL, and then try to figure out how the two languages match up.\n\nbig_diamonds_db |>\n show_query()\n#> <SQL>\n#> SELECT carat, cut, color, clarity, price\n#> FROM diamonds\n#> WHERE (price > 15000.0)\n\nTo get all the data back into R, you call collect(). Behind the scenes, this generates the SQL, calls dbGetQuery() to get the data, then turns the result into a tibble:\n\nbig_diamonds <- big_diamonds_db |> \n collect()\nbig_diamonds\n#> # A tibble: 1,655 × 5\n#> carat cut color clarity price\n#> <dbl> <fct> <fct> <fct> <int>\n#> 1 1.54 Premium E VS2 15002\n#> 2 1.19 Ideal F VVS1 15005\n#> 3 2.1 Premium I SI1 15007\n#> 4 1.69 Ideal D SI1 15011\n#> 5 1.5 Very Good G VVS2 15013\n#> 6 1.73 Very Good G VS1 15014\n#> # ℹ 1,649 more rows\n\nTypically, you’ll use dbplyr to select the data you want from the database, performing basic filtering and aggregation using the translations described below. Then, once you’re ready to analyse the data with functions that are unique to R, you’ll collect() the data to get an in-memory tibble, and continue your work with pure R code." 
+ "text": "21.4 dbplyr basics\nNow that we’ve connected to a database and loaded up some data, we can start to learn about dbplyr. dbplyr is a dplyr backend, which means that you keep writing dplyr code but the backend executes it differently. In this, dbplyr translates to SQL; other backends include dtplyr which translates to data.table, and multidplyr which executes your code on multiple cores.\nTo use dbplyr, you must first use tbl() to create an object that represents a database table:\n\ndiamonds_db <- tbl(con, \"diamonds\")\ndiamonds_db\n#> # Source: table<diamonds> [?? x 10]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]\n#> carat cut color clarity depth table price x y z\n#> <dbl> <fct> <fct> <fct> <dbl> <dbl> <int> <dbl> <dbl> <dbl>\n#> 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43\n#> 2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31\n#> 3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31\n#> 4 0.29 Premium I VS2 62.4 58 334 4.2 4.23 2.63\n#> 5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75\n#> 6 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48\n#> # ℹ more rows\n\n\n\n\n\n\n\nThere are two other common ways to interact with a database. First, many corporate databases are very large so you need some hierarchy to keep all the tables organized. In that case you might need to supply a schema, or a catalog and a schema, in order to pick the table you’re interested in:\n\ndiamonds_db <- tbl(con, in_schema(\"sales\", \"diamonds\"))\ndiamonds_db <- tbl(con, in_catalog(\"north_america\", \"sales\", \"diamonds\"))\n\nOther times you might want to use your own SQL query as a starting point:\n\ndiamonds_db <- tbl(con, sql(\"SELECT * FROM diamonds\"))\n\n\n\n\nThis object is lazy; when you use dplyr verbs on it, dplyr doesn’t do any work: it just records the sequence of operations that you want to perform and only performs them when needed. 
For example, take the following pipeline:\n\nbig_diamonds_db <- diamonds_db |> \n filter(price > 15000) |> \n select(carat:clarity, price)\n\nbig_diamonds_db\n#> # Source: SQL [?? x 5]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]\n#> carat cut color clarity price\n#> <dbl> <fct> <fct> <fct> <int>\n#> 1 1.54 Premium E VS2 15002\n#> 2 1.19 Ideal F VVS1 15005\n#> 3 2.1 Premium I SI1 15007\n#> 4 1.69 Ideal D SI1 15011\n#> 5 1.5 Very Good G VVS2 15013\n#> 6 1.73 Very Good G VS1 15014\n#> # ℹ more rows\n\nYou can tell this object represents a database query because it prints the DBMS name at the top, and while it tells you the number of columns, it typically doesn’t know the number of rows. This is because finding the total number of rows usually requires executing the complete query, something we’re trying to avoid.\nYou can see the SQL code generated by the dplyr function show_query(). If you know dplyr, this is a great way to learn SQL! Write some dplyr code, get dbplyr to translate it to SQL, and then try to figure out how the two languages match up.\n\nbig_diamonds_db |>\n show_query()\n#> <SQL>\n#> SELECT carat, cut, color, clarity, price\n#> FROM diamonds\n#> WHERE (price > 15000.0)\n\nTo get all the data back into R, you call collect(). Behind the scenes, this generates the SQL, calls dbGetQuery() to get the data, then turns the result into a tibble:\n\nbig_diamonds <- big_diamonds_db |> \n collect()\nbig_diamonds\n#> # A tibble: 1,655 × 5\n#> carat cut color clarity price\n#> <dbl> <fct> <fct> <fct> <int>\n#> 1 1.54 Premium E VS2 15002\n#> 2 1.19 Ideal F VVS1 15005\n#> 3 2.1 Premium I SI1 15007\n#> 4 1.69 Ideal D SI1 15011\n#> 5 1.5 Very Good G VVS2 15013\n#> 6 1.73 Very Good G VS1 15014\n#> # ℹ 1,649 more rows\n\nTypically, you’ll use dbplyr to select the data you want from the database, performing basic filtering and aggregation using the translations described below. 
Then, once you’re ready to analyse the data with functions that are unique to R, you’ll collect() the data to get an in-memory tibble, and continue your work with pure R code." }, { "objectID": "databases.html#sql", "href": "databases.html#sql", "title": "21 Databases", "section": "\n21.5 SQL", - "text": "21.5 SQL\nThe rest of the chapter will teach you a little SQL through the lens of dbplyr. It’s a rather non-traditional introduction to SQL but we hope it will get you quickly up to speed with the basics. Luckily, if you understand dplyr you’re in a great place to quickly pick up SQL because so many of the concepts are the same.\nWe’ll explore the relationship between dplyr and SQL using a couple of old friends from the nycflights13 package: flights and planes. These datasets are easy to get into our learning database because dbplyr comes with a function that copies the tables from nycflights13 to our database:\n\ndbplyr::copy_nycflights13(con)\n#> Creating table: airlines\n#> Creating table: airports\n#> Creating table: flights\n#> Creating table: planes\n#> Creating table: weather\nflights <- tbl(con, \"flights\")\nplanes <- tbl(con, \"planes\")\n\n\n21.5.1 SQL basics\nThe top-level components of SQL are called statements. Common statements include CREATE for defining new tables, INSERT for adding data, and SELECT for retrieving data. We will focus on SELECT statements, also called queries, because they are almost exclusively what you’ll use as a data scientist.\nA query is made up of clauses. There are five important clauses: SELECT, FROM, WHERE, ORDER BY, and GROUP BY. Every query must have the SELECT and FROM clauses and the simplest query is SELECT * FROM table, which selects all columns from the specified table. 
This is what dbplyr generates for an unadulterated table:\n\nflights |> show_query()\n#> <SQL>\n#> SELECT *\n#> FROM flights\nplanes |> show_query()\n#> <SQL>\n#> SELECT *\n#> FROM planes\n\nWHERE and ORDER BY control which rows are included and how they are ordered:\n\nflights |> \n filter(dest == \"IAH\") |> \n arrange(dep_delay) |>\n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (dest = 'IAH')\n#> ORDER BY dep_delay\n\nGROUP BY converts the query to a summary, causing aggregation to happen:\n\nflights |> \n group_by(dest) |> \n summarize(dep_delay = mean(dep_delay, na.rm = TRUE)) |> \n show_query()\n#> <SQL>\n#> SELECT dest, AVG(dep_delay) AS dep_delay\n#> FROM flights\n#> GROUP BY dest\n\nThere are two important differences between dplyr verbs and SELECT clauses:\n\nIn SQL, case doesn’t matter: you can write select, SELECT, or even SeLeCt. In this book we’ll stick with the common convention of writing SQL keywords in uppercase to distinguish them from table or variable names.\nIn SQL, order matters: you must always write the clauses in the order SELECT, FROM, WHERE, GROUP BY, ORDER BY. Confusingly, this order doesn’t match how the clauses are actually evaluated, which is first FROM, then WHERE, GROUP BY, SELECT, and ORDER BY.\n\nThe following sections explore each clause in more detail.\n\n\n\n\n\n\nNote that while SQL is a standard, it is extremely complex and no database follows it exactly. While the main components that we’ll focus on in this book are very similar between DBMS’s, there are many minor variations. Fortunately, dbplyr is designed to handle this problem and generates different translations for different databases. 
It’s not perfect, but it’s continually improving, and if you hit a problem you can file an issue on GitHub to help us do better.\n\n\n\n\n21.5.2 SELECT\nThe SELECT clause is the workhorse of queries and performs the same job as select(), mutate(), rename(), relocate(), and, as you’ll learn in the next section, summarize().\nselect(), rename(), and relocate() have very direct translations to SELECT as they just affect where a column appears (if at all) along with its name:\n\nplanes |> \n select(tailnum, type, manufacturer, model, year) |> \n show_query()\n#> <SQL>\n#> SELECT tailnum, \"type\", manufacturer, model, \"year\"\n#> FROM planes\n\nplanes |> \n select(tailnum, type, manufacturer, model, year) |> \n rename(year_built = year) |> \n show_query()\n#> <SQL>\n#> SELECT tailnum, \"type\", manufacturer, model, \"year\" AS year_built\n#> FROM planes\n\nplanes |> \n select(tailnum, type, manufacturer, model, year) |> \n relocate(manufacturer, model, .before = type) |> \n show_query()\n#> <SQL>\n#> SELECT tailnum, manufacturer, model, \"type\", \"year\"\n#> FROM planes\n\nThis example also shows you how SQL does renaming. In SQL terminology renaming is called aliasing and is done with AS. Note that unlike mutate(), the old name is on the left and the new name is on the right.\n\n\n\n\n\n\nIn the examples above note that \"year\" and \"type\" are wrapped in double quotes. 
That’s because these are reserved words in duckdb, so dbplyr quotes them to avoid any potential confusion between column/table names and SQL operators.\nWhen working with other databases you’re likely to see every variable name quoted because only a handful of client packages, like duckdb, know what all the reserved words are, so the others quote everything to be safe.\nSELECT \"tailnum\", \"type\", \"manufacturer\", \"model\", \"year\"\nFROM \"planes\"\nSome other database systems use backticks instead of quotes:\nSELECT `tailnum`, `type`, `manufacturer`, `model`, `year`\nFROM `planes`\n\n\n\nThe translations for mutate() are similarly straightforward: each variable becomes a new expression in SELECT:\n\nflights |> \n mutate(\n speed = distance / (air_time / 60)\n ) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*, distance / (air_time / 60.0) AS speed\n#> FROM flights\n\nWe’ll come back to the translation of individual components (like /) in Section 21.6.\n\n21.5.3 FROM\nThe FROM clause defines the data source. It’s going to be rather uninteresting for a little while, because we’re just using single tables. 
You’ll see more complex examples once we hit the join functions.\n\n21.5.4 GROUP BY\ngroup_by() is translated to the GROUP BY clause and summarize() is translated to the SELECT clause:\n\ndiamonds_db |> \n group_by(cut) |> \n summarize(\n n = n(),\n avg_price = mean(price, na.rm = TRUE)\n ) |> \n show_query()\n#> <SQL>\n#> SELECT cut, COUNT(*) AS n, AVG(price) AS avg_price\n#> FROM diamonds\n#> GROUP BY cut\n\nWe’ll come back to what’s happening with the translation of n() and mean() in Section 21.6.\n\n21.5.5 WHERE\nfilter() is translated to the WHERE clause:\n\nflights |> \n filter(dest == \"IAH\" | dest == \"HOU\") |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (dest = 'IAH' OR dest = 'HOU')\n\nflights |> \n filter(arr_delay > 0 & arr_delay < 20) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (arr_delay > 0.0 AND arr_delay < 20.0)\n\nThere are a few important details to note here:\n\n\n| becomes OR and & becomes AND.\nSQL uses = for comparison, not ==. SQL doesn’t have assignment, so there’s no potential for confusion there.\nSQL uses only '' for strings, not \"\". In SQL, \"\" is used to identify variables, like R’s ``.\n\nAnother useful SQL operator is IN, which is very close to R’s %in%:\n\nflights |> \n filter(dest %in% c(\"IAH\", \"HOU\")) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (dest IN ('IAH', 'HOU'))\n\nSQL uses NULL instead of NA. NULLs behave similarly to NAs. The main difference is that while they’re “infectious” in comparisons and arithmetic, they are silently dropped when summarizing. dbplyr will remind you about this behavior the first time you hit it:\n\nflights |> \n group_by(dest) |> \n summarize(delay = mean(arr_delay))\n#> Warning: Missing values are always removed in SQL aggregation functions.\n#> Use `na.rm = TRUE` to silence this warning\n#> This warning is displayed once every 8 hours.\n#> # Source: SQL [?? 
x 2]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]\n#> dest delay\n#> <chr> <dbl>\n#> 1 SFO 2.67\n#> 2 SJU 2.52\n#> 3 SNA -7.87\n#> 4 SRQ 3.08\n#> 5 CHS 10.6 \n#> 6 SAN 3.14\n#> # ℹ more rows\n\nIf you want to learn more about how NULLs work, you might enjoy “Three valued logic” by Markus Winand.\nIn general, you can work with NULLs using the functions you’d use for NAs in R:\n\nflights |> \n filter(!is.na(dep_delay)) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (NOT((dep_delay IS NULL)))\n\nThis SQL query illustrates one of the drawbacks of dbplyr: while the SQL is correct, it isn’t as simple as you might write by hand. In this case, you could drop the parentheses and use a special operator that’s easier to read:\nWHERE \"dep_delay\" IS NOT NULL\nNote that if you filter() a variable that you created using summarize(), dbplyr will generate a HAVING clause, rather than a WHERE clause. This is one of the idiosyncrasies of SQL: WHERE is evaluated before SELECT and GROUP BY, so SQL needs another clause that’s evaluated afterwards.\n\ndiamonds_db |> \n group_by(cut) |> \n summarize(n = n()) |> \n filter(n > 100) |> \n show_query()\n#> <SQL>\n#> SELECT cut, COUNT(*) AS n\n#> FROM diamonds\n#> GROUP BY cut\n#> HAVING (COUNT(*) > 100.0)\n\n\n21.5.6 ORDER BY\nOrdering rows involves a straightforward translation from arrange() to the ORDER BY clause:\n\nflights |> \n arrange(year, month, day, desc(dep_delay)) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> ORDER BY \"year\", \"month\", \"day\", dep_delay DESC\n\nNotice how desc() is translated to DESC: this is one of the many dplyr functions whose name was directly inspired by SQL.\n\n21.5.7 Subqueries\nSometimes it’s not possible to translate a dplyr pipeline into a single SELECT statement and you need to use a subquery. 
A subquery is just a query used as a data source in the FROM clause, instead of the usual table.\ndbplyr typically uses subqueries to work around limitations of SQL. For example, expressions in the SELECT clause can’t refer to columns that were just created. That means that the following (silly) dplyr pipeline needs to happen in two steps: the first (inner) query computes year1 and then the second (outer) query can compute year2.\n\nflights |> \n mutate(\n year1 = year + 1,\n year2 = year1 + 1\n ) |> \n show_query()\n#> <SQL>\n#> SELECT q01.*, year1 + 1.0 AS year2\n#> FROM (\n#> SELECT flights.*, \"year\" + 1.0 AS year1\n#> FROM flights\n#> ) q01\n\nYou’ll also see this if you attempt to filter() a variable that you just created. Remember, even though WHERE is written after SELECT, it’s evaluated before it, so we need a subquery in this (silly) example:\n\nflights |> \n mutate(year1 = year + 1) |> \n filter(year1 == 2014) |> \n show_query()\n#> <SQL>\n#> SELECT q01.*\n#> FROM (\n#> SELECT flights.*, \"year\" + 1.0 AS year1\n#> FROM flights\n#> ) q01\n#> WHERE (year1 = 2014.0)\n\nSometimes dbplyr will create a subquery where it’s not needed because it doesn’t yet know how to optimize that translation. As dbplyr improves over time, these cases will get rarer but will probably never go away.\n\n21.5.8 Joins\nIf you’re familiar with dplyr’s joins, SQL joins are very similar. 
Here’s a simple example:\n\nflights |> \n left_join(planes |> rename(year_built = year), by = \"tailnum\") |> \n show_query()\n#> <SQL>\n#> SELECT\n#> flights.*,\n#> planes.\"year\" AS year_built,\n#> \"type\",\n#> manufacturer,\n#> model,\n#> engines,\n#> seats,\n#> speed,\n#> engine\n#> FROM flights\n#> LEFT JOIN planes\n#> ON (flights.tailnum = planes.tailnum)\n\nThe main thing to notice here is the syntax: SQL joins use sub-clauses of the FROM clause to bring in additional tables, using ON to define how the tables are related.\ndplyr’s names for these functions are so closely connected to SQL that you can easily guess the equivalent SQL for inner_join(), right_join(), and full_join():\nSELECT flights.*, \"type\", manufacturer, model, engines, seats, speed\nFROM flights\nINNER JOIN planes ON (flights.tailnum = planes.tailnum)\n\nSELECT flights.*, \"type\", manufacturer, model, engines, seats, speed\nFROM flights\nRIGHT JOIN planes ON (flights.tailnum = planes.tailnum)\n\nSELECT flights.*, \"type\", manufacturer, model, engines, seats, speed\nFROM flights\nFULL JOIN planes ON (flights.tailnum = planes.tailnum)\nYou’re likely to need many joins when working with data from a database. That’s because database tables are often stored in a highly normalized form, where each “fact” is stored in a single place and to keep a complete dataset for analysis you need to navigate a complex network of tables connected by primary and foreign keys. If you hit this scenario, the dm package, by Tobias Schieferdecker, Kirill Müller, and Darko Bergant, is a life saver. It can automatically determine the connections between tables using the constraints that DBAs often supply, visualize the connections so you can see what’s going on, and generate the joins you need to connect one table to another.\n\n21.5.9 Other verbs\ndbplyr also translates other verbs like distinct(), slice_*(), and intersect(), and a growing selection of tidyr functions like pivot_longer() and pivot_wider(). 
The easiest way to see the full set of what’s currently available is to visit the dbplyr website: https://dbplyr.tidyverse.org/reference/.\n\n21.5.10 Exercises\n\nWhat is distinct() translated to? How about head()?\n\nExplain what each of the following SQL queries does and try to recreate them using dbplyr.\nSELECT * \nFROM flights\nWHERE dep_delay < arr_delay\n\nSELECT *, distance / (air_time / 60) AS speed\nFROM flights" + "text": "21.5 SQL\nThe rest of the chapter will teach you a little SQL through the lens of dbplyr. It’s a rather non-traditional introduction to SQL but we hope it will get you quickly up to speed with the basics. Luckily, if you understand dplyr you’re in a great place to quickly pick up SQL because so many of the concepts are the same.\nWe’ll explore the relationship between dplyr and SQL using a couple of old friends from the nycflights13 package: flights and planes. These datasets are easy to get into our learning database because dbplyr comes with a function that copies the tables from nycflights13 to our database:\n\ndbplyr::copy_nycflights13(con)\n#> Creating table: airlines\n#> Creating table: airports\n#> Creating table: flights\n#> Creating table: planes\n#> Creating table: weather\nflights <- tbl(con, \"flights\")\nplanes <- tbl(con, \"planes\")\n\n\n21.5.1 SQL basics\nThe top-level components of SQL are called statements. Common statements include CREATE for defining new tables, INSERT for adding data, and SELECT for retrieving data. We will focus on SELECT statements, also called queries, because they are almost exclusively what you’ll use as a data scientist.\nA query is made up of clauses. There are five important clauses: SELECT, FROM, WHERE, ORDER BY, and GROUP BY. Every query must have the SELECT and FROM clauses and the simplest query is SELECT * FROM table, which selects all columns from the specified table. 
This is what dbplyr generates for an unadulterated table:\n\nflights |> show_query()\n#> <SQL>\n#> SELECT *\n#> FROM flights\nplanes |> show_query()\n#> <SQL>\n#> SELECT *\n#> FROM planes\n\nWHERE and ORDER BY control which rows are included and how they are ordered:\n\nflights |> \n filter(dest == \"IAH\") |> \n arrange(dep_delay) |>\n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (dest = 'IAH')\n#> ORDER BY dep_delay\n\nGROUP BY converts the query to a summary, causing aggregation to happen:\n\nflights |> \n group_by(dest) |> \n summarize(dep_delay = mean(dep_delay, na.rm = TRUE)) |> \n show_query()\n#> <SQL>\n#> SELECT dest, AVG(dep_delay) AS dep_delay\n#> FROM flights\n#> GROUP BY dest\n\nThere are two important differences between dplyr verbs and SELECT clauses:\n\nIn SQL, case doesn’t matter: you can write select, SELECT, or even SeLeCt. In this book we’ll stick with the common convention of writing SQL keywords in uppercase to distinguish them from table or variable names.\nIn SQL, order matters: you must always write the clauses in the order SELECT, FROM, WHERE, GROUP BY, ORDER BY. Confusingly, this order doesn’t match how the clauses are actually evaluated, which is first FROM, then WHERE, GROUP BY, SELECT, and ORDER BY.\n\nThe following sections explore each clause in more detail.\n\n\n\n\n\n\nNote that while SQL is a standard, it is extremely complex and no database follows it exactly. While the main components that we’ll focus on in this book are very similar between DBMS’s, there are many minor variations. Fortunately, dbplyr is designed to handle this problem and generates different translations for different databases. 
It’s not perfect, but it’s continually improving, and if you hit a problem you can file an issue on GitHub to help us do better.\n\n\n\n\n21.5.2 SELECT\nThe SELECT clause is the workhorse of queries and performs the same job as select(), mutate(), rename(), relocate(), and, as you’ll learn in the next section, summarize().\nselect(), rename(), and relocate() have very direct translations to SELECT as they just affect where a column appears (if at all) along with its name:\n\nplanes |> \n select(tailnum, type, manufacturer, model, year) |> \n show_query()\n#> <SQL>\n#> SELECT tailnum, \"type\", manufacturer, model, \"year\"\n#> FROM planes\n\nplanes |> \n select(tailnum, type, manufacturer, model, year) |> \n rename(year_built = year) |> \n show_query()\n#> <SQL>\n#> SELECT tailnum, \"type\", manufacturer, model, \"year\" AS year_built\n#> FROM planes\n\nplanes |> \n select(tailnum, type, manufacturer, model, year) |> \n relocate(manufacturer, model, .before = type) |> \n show_query()\n#> <SQL>\n#> SELECT tailnum, manufacturer, model, \"type\", \"year\"\n#> FROM planes\n\nThis example also shows you how SQL does renaming. In SQL terminology renaming is called aliasing and is done with AS. Note that unlike mutate(), the old name is on the left and the new name is on the right.\n\n\n\n\n\n\nIn the examples above note that \"year\" and \"type\" are wrapped in double quotes. 
That’s because these are reserved words in duckdb, so dbplyr quotes them to avoid any potential confusion between column/table names and SQL operators.\nWhen working with other databases you’re likely to see every variable name quoted because only a handful of client packages, like duckdb, know what all the reserved words are, so the others quote everything to be safe.\nSELECT \"tailnum\", \"type\", \"manufacturer\", \"model\", \"year\"\nFROM \"planes\"\nSome other database systems use backticks instead of quotes:\nSELECT `tailnum`, `type`, `manufacturer`, `model`, `year`\nFROM `planes`\n\n\n\nThe translations for mutate() are similarly straightforward: each variable becomes a new expression in SELECT:\n\nflights |> \n mutate(\n speed = distance / (air_time / 60)\n ) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*, distance / (air_time / 60.0) AS speed\n#> FROM flights\n\nWe’ll come back to the translation of individual components (like /) in Section 21.6.\n\n21.5.3 FROM\nThe FROM clause defines the data source. It’s going to be rather uninteresting for a little while, because we’re just using single tables. 
You’ll see more complex examples once we hit the join functions.\n\n21.5.4 GROUP BY\ngroup_by() is translated to the GROUP BY clause and summarize() is translated to the SELECT clause:\n\ndiamonds_db |> \n group_by(cut) |> \n summarize(\n n = n(),\n avg_price = mean(price, na.rm = TRUE)\n ) |> \n show_query()\n#> <SQL>\n#> SELECT cut, COUNT(*) AS n, AVG(price) AS avg_price\n#> FROM diamonds\n#> GROUP BY cut\n\nWe’ll come back to what’s happening with the translation of n() and mean() in Section 21.6.\n\n21.5.5 WHERE\nfilter() is translated to the WHERE clause:\n\nflights |> \n filter(dest == \"IAH\" | dest == \"HOU\") |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (dest = 'IAH' OR dest = 'HOU')\n\nflights |> \n filter(arr_delay > 0 & arr_delay < 20) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (arr_delay > 0.0 AND arr_delay < 20.0)\n\nThere are a few important details to note here:\n\n\n| becomes OR and & becomes AND.\nSQL uses = for comparison, not ==. SQL doesn’t have assignment, so there’s no potential for confusion there.\nSQL uses only '' for strings, not \"\". In SQL, \"\" is used to identify variables, like R’s ``.\n\nAnother useful SQL operator is IN, which is very close to R’s %in%:\n\nflights |> \n filter(dest %in% c(\"IAH\", \"HOU\")) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (dest IN ('IAH', 'HOU'))\n\nSQL uses NULL instead of NA. NULLs behave similarly to NAs. The main difference is that while they’re “infectious” in comparisons and arithmetic, they are silently dropped when summarizing. dbplyr will remind you about this behavior the first time you hit it:\n\nflights |> \n group_by(dest) |> \n summarize(delay = mean(arr_delay))\n#> Warning: Missing values are always removed in SQL aggregation functions.\n#> Use `na.rm = TRUE` to silence this warning\n#> This warning is displayed once every 8 hours.\n#> # Source: SQL [?? 
x 2]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]\n#> dest delay\n#> <chr> <dbl>\n#> 1 SFO 2.67\n#> 2 SJU 2.52\n#> 3 SNA -7.87\n#> 4 SRQ 3.08\n#> 5 CHS 10.6 \n#> 6 SAN 3.14\n#> # ℹ more rows\n\nIf you want to learn more about how NULLs work, you might enjoy “Three valued logic” by Markus Winand.\nIn general, you can work with NULLs using the functions you’d use for NAs in R:\n\nflights |> \n filter(!is.na(dep_delay)) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> WHERE (NOT((dep_delay IS NULL)))\n\nThis SQL query illustrates one of the drawbacks of dbplyr: while the SQL is correct, it isn’t as simple as you might write by hand. In this case, you could drop the parentheses and use a special operator that’s easier to read:\nWHERE \"dep_delay\" IS NOT NULL\nNote that if you filter() a variable that you created using summarize(), dbplyr will generate a HAVING clause, rather than a WHERE clause. This is one of the idiosyncrasies of SQL: WHERE is evaluated before SELECT and GROUP BY, so SQL needs another clause that’s evaluated afterwards.\n\ndiamonds_db |> \n group_by(cut) |> \n summarize(n = n()) |> \n filter(n > 100) |> \n show_query()\n#> <SQL>\n#> SELECT cut, COUNT(*) AS n\n#> FROM diamonds\n#> GROUP BY cut\n#> HAVING (COUNT(*) > 100.0)\n\n\n21.5.6 ORDER BY\nOrdering rows involves a straightforward translation from arrange() to the ORDER BY clause:\n\nflights |> \n arrange(year, month, day, desc(dep_delay)) |> \n show_query()\n#> <SQL>\n#> SELECT flights.*\n#> FROM flights\n#> ORDER BY \"year\", \"month\", \"day\", dep_delay DESC\n\nNotice how desc() is translated to DESC: this is one of the many dplyr functions whose name was directly inspired by SQL.\n\n21.5.7 Subqueries\nSometimes it’s not possible to translate a dplyr pipeline into a single SELECT statement and you need to use a subquery. 
A subquery is just a query used as a data source in the FROM clause, instead of the usual table.\ndbplyr typically uses subqueries to work around limitations of SQL. For example, expressions in the SELECT clause can’t refer to columns that were just created. That means that the following (silly) dplyr pipeline needs to happen in two steps: the first (inner) query computes year1 and then the second (outer) query can compute year2.\n\nflights |> \n mutate(\n year1 = year + 1,\n year2 = year1 + 1\n ) |> \n show_query()\n#> <SQL>\n#> SELECT q01.*, year1 + 1.0 AS year2\n#> FROM (\n#> SELECT flights.*, \"year\" + 1.0 AS year1\n#> FROM flights\n#> ) q01\n\nYou’ll also see this if you attempted to filter() a variable that you just created. Remember, even though WHERE is written after SELECT, it’s evaluated before it, so we need a subquery in this (silly) example:\n\nflights |> \n mutate(year1 = year + 1) |> \n filter(year1 == 2014) |> \n show_query()\n#> <SQL>\n#> SELECT q01.*\n#> FROM (\n#> SELECT flights.*, \"year\" + 1.0 AS year1\n#> FROM flights\n#> ) q01\n#> WHERE (year1 = 2014.0)\n\nSometimes dbplyr will create a subquery where it’s not needed because it doesn’t yet know how to optimize that translation. As dbplyr improves over time, these cases will get rarer but will probably never go away.\n\n21.5.8 Joins\nIf you’re familiar with dplyr’s joins, SQL joins are very similar. 
Here’s a simple example:\n\nflights |> \n left_join(planes |> rename(year_built = year), by = \"tailnum\") |> \n show_query()\n#> <SQL>\n#> SELECT\n#> flights.*,\n#> planes.\"year\" AS year_built,\n#> \"type\",\n#> manufacturer,\n#> model,\n#> engines,\n#> seats,\n#> speed,\n#> engine\n#> FROM flights\n#> LEFT JOIN planes\n#> ON (flights.tailnum = planes.tailnum)\n\nThe main thing to notice here is the syntax: SQL joins use sub-clauses of the FROM clause to bring in additional tables, using ON to define how the tables are related.\ndplyr’s names for these functions are so closely connected to SQL that you can easily guess the equivalent SQL for inner_join(), right_join(), and full_join():\nSELECT flights.*, \"type\", manufacturer, model, engines, seats, speed\nFROM flights\nINNER JOIN planes ON (flights.tailnum = planes.tailnum)\n\nSELECT flights.*, \"type\", manufacturer, model, engines, seats, speed\nFROM flights\nRIGHT JOIN planes ON (flights.tailnum = planes.tailnum)\n\nSELECT flights.*, \"type\", manufacturer, model, engines, seats, speed\nFROM flights\nFULL JOIN planes ON (flights.tailnum = planes.tailnum)\nYou’re likely to need many joins when working with data from a database. That’s because database tables are often stored in a highly normalized form, where each “fact” is stored in a single place and to keep a complete dataset for analysis you need to navigate a complex network of tables connected by primary and foreign keys. If you hit this scenario, the dm package, by Tobias Schieferdecker, Kirill Müller, and Darko Bergant, is a life saver. It can automatically determine the connections between tables using the constraints that DBAs often supply, visualize the connections so you can see what’s going on, and generate the joins you need to connect one table to another.\n\n21.5.9 Other verbs\ndbplyr also translates other verbs like distinct(), slice_*(), and intersect(), and a growing selection of tidyr functions like pivot_longer() and pivot_wider(). 
The easiest way to see the full set of what’s currently available is to visit the dbplyr website: https://dbplyr.tidyverse.org/reference/.\n\n21.5.10 Exercises\n\nWhat is distinct() translated to? How about head()?\n\nExplain what each of the following SQL queries do and try recreate them using dbplyr.\nSELECT * \nFROM flights\nWHERE dep_delay < arr_delay\n\nSELECT *, distance / (air_time / 60) AS speed\nFROM flights" }, { "objectID": "databases.html#sec-sql-expressions", @@ -1390,7 +1390,7 @@ "href": "iteration.html#saving-multiple-outputs", "title": "26 Iteration", "section": "\n26.4 Saving multiple outputs", - "text": "26.4 Saving multiple outputs\nIn the last section, you learned about map(), which is useful for reading multiple files into a single object. In this section, we’ll now explore sort of the opposite problem: how can you take one or more R objects and save it to one or more files? We’ll explore this challenge using three examples:\n\nSaving multiple data frames into one database.\nSaving multiple data frames into multiple .csv files.\nSaving multiple plots to multiple .png files.\n\n\n26.4.1 Writing to a database\nSometimes when working with many files at once, it’s not possible to fit all your data into memory at once, and you can’t do map(files, read_csv). One approach to deal with this problem is to load your data into a database so you can access just the bits you need with dbplyr.\nIf you’re lucky, the database package you’re using will provide a handy function that takes a vector of paths and loads them all into the database. This is the case with duckdb’s duckdb_read_csv():\n\ncon <- DBI::dbConnect(duckdb::duckdb())\nduckdb::duckdb_read_csv(con, \"gapminder\", paths)\n\nThis would work well here, but we don’t have csv files, instead we have excel spreadsheets. So we’re going to have to do it “by hand”. 
Learning to do it by hand will also help you when you have a bunch of csvs and the database that you’re working with doesn’t have one function that will load them all in.\nWe need to start by creating a table that will fill in with data. The easiest way to do this is by creating a template, a dummy data frame that contains all the columns we want, but only a sampling of the data. For the gapminder data, we can make that template by reading a single file and adding the year to it:\n\ntemplate <- readxl::read_excel(paths[[1]])\ntemplate$year <- 1952\ntemplate\n#> # A tibble: 142 × 6\n#> country continent lifeExp pop gdpPercap year\n#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>\n#> 1 Afghanistan Asia 28.8 8425333 779. 1952\n#> 2 Albania Europe 55.2 1282697 1601. 1952\n#> 3 Algeria Africa 43.1 9279525 2449. 1952\n#> 4 Angola Africa 30.0 4232095 3521. 1952\n#> 5 Argentina Americas 62.5 17876956 5911. 1952\n#> 6 Australia Oceania 69.1 8691212 10040. 1952\n#> # ℹ 136 more rows\n\nNow we can connect to the database, and use DBI::dbCreateTable() to turn our template into a database table:\n\ncon <- DBI::dbConnect(duckdb::duckdb())\nDBI::dbCreateTable(con, \"gapminder\", template)\n\ndbCreateTable() doesn’t use the data in template, just the variable names and types. So if we inspect the gapminder table now you’ll see that it’s empty but it has the variables we need with the types we expect:\n\ncon |> tbl(\"gapminder\")\n#> # Source: table<gapminder> [0 x 6]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]\n#> # ℹ 6 variables: country <chr>, continent <chr>, lifeExp <dbl>, pop <dbl>,\n#> # gdpPercap <dbl>, year <dbl>\n\nNext, we need a function that takes a single file path, reads it into R, and adds the result to the gapminder table. 
We can do that by combining read_excel() with DBI::dbAppendTable():\n\nappend_file <- function(path) {\n df <- readxl::read_excel(path)\n df$year <- parse_number(basename(path))\n \n DBI::dbAppendTable(con, \"gapminder\", df)\n}\n\nNow we need to call append_file() once for each element of paths. That’s certainly possible with map():\n\npaths |> map(append_file)\n\nBut we don’t care about the output of append_file(), so instead of map() it’s slightly nicer to use walk(). walk() does exactly the same thing as map() but throws the output away:\n\npaths |> walk(append_file)\n\nNow we can see if we have all the data in our table:\n\ncon |> \n tbl(\"gapminder\") |> \n count(year)\n#> # Source: SQL [?? x 2]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1016-azure:R 4.3.2/:memory:]\n#> year n\n#> <dbl> <dbl>\n#> 1 1967 142\n#> 2 1977 142\n#> 3 1987 142\n#> 4 2007 142\n#> 5 1952 142\n#> 6 1957 142\n#> # ℹ more rows\n\n\n26.4.2 Writing csv files\nThe same basic principle applies if we want to write multiple csv files, one for each group. Let’s imagine that we want to take the ggplot2::diamonds data and save one csv file for each clarity. First we need to make those individual datasets. There are many ways you could do that, but there’s one way we particularly like: group_nest().\n\nby_clarity <- diamonds |> \n group_nest(clarity)\n\nby_clarity\n#> # A tibble: 8 × 2\n#> clarity data\n#> <ord> <list<tibble[,9]>>\n#> 1 I1 [741 × 9]\n#> 2 SI2 [9,194 × 9]\n#> 3 SI1 [13,065 × 9]\n#> 4 VS2 [12,258 × 9]\n#> 5 VS1 [8,171 × 9]\n#> 6 VVS2 [5,066 × 9]\n#> # ℹ 2 more rows\n\nThis gives us a new tibble with eight rows and two columns. 
clarity is our grouping variable and data is a list-column containing one tibble for each unique value of clarity:\n\nby_clarity$data[[1]]\n#> # A tibble: 741 × 9\n#> carat cut color depth table price x y z\n#> <dbl> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>\n#> 1 0.32 Premium E 60.9 58 345 4.38 4.42 2.68\n#> 2 1.17 Very Good J 60.2 61 2774 6.83 6.9 4.13\n#> 3 1.01 Premium F 61.8 60 2781 6.39 6.36 3.94\n#> 4 1.01 Fair E 64.5 58 2788 6.29 6.21 4.03\n#> 5 0.96 Ideal F 60.7 55 2801 6.37 6.41 3.88\n#> 6 1.04 Premium G 62.2 58 2801 6.46 6.41 4 \n#> # ℹ 735 more rows\n\nWhile we’re here, let’s create a column that gives the name of the output file, using mutate() and str_glue():\n\nby_clarity <- by_clarity |> \n mutate(path = str_glue(\"diamonds-{clarity}.csv\"))\n\nby_clarity\n#> # A tibble: 8 × 3\n#> clarity data path \n#> <ord> <list<tibble[,9]>> <glue> \n#> 1 I1 [741 × 9] diamonds-I1.csv \n#> 2 SI2 [9,194 × 9] diamonds-SI2.csv \n#> 3 SI1 [13,065 × 9] diamonds-SI1.csv \n#> 4 VS2 [12,258 × 9] diamonds-VS2.csv \n#> 5 VS1 [8,171 × 9] diamonds-VS1.csv \n#> 6 VVS2 [5,066 × 9] diamonds-VVS2.csv\n#> # ℹ 2 more rows\n\nSo if we were going to save these data frames by hand, we might write something like:\n\nwrite_csv(by_clarity$data[[1]], by_clarity$path[[1]])\nwrite_csv(by_clarity$data[[2]], by_clarity$path[[2]])\nwrite_csv(by_clarity$data[[3]], by_clarity$path[[3]])\n...\nwrite_csv(by_clarity$data[[8]], by_clarity$path[[8]])\n\nThis is a little different to our previous uses of map() because there are two arguments that are changing, not just one. That means we need a new function: map2(), which varies both the first and second arguments. And because we again don’t care about the output, we want walk2() rather than map2(). That gives us:\n\nwalk2(by_clarity$data, by_clarity$path, write_csv)\n\n\n26.4.3 Saving plots\nWe can take the same basic approach to create many plots. 
Let’s first make a function that draws the plot we want:\n\ncarat_histogram <- function(df) {\n ggplot(df, aes(x = carat)) + geom_histogram(binwidth = 0.1) \n}\n\ncarat_histogram(by_clarity$data[[1]])\n\n\n\n\nNow we can use map() to create a list of many plots7 and their eventual file paths:\n\nby_clarity <- by_clarity |> \n mutate(\n plot = map(data, carat_histogram),\n path = str_glue(\"clarity-{clarity}.png\")\n )\n\nThen use walk2() with ggsave() to save each plot:\n\nwalk2(\n by_clarity$path,\n by_clarity$plot,\n \\(path, plot) ggsave(path, plot, width = 6, height = 6)\n)\n\nThis is shorthand for:\n\nggsave(by_clarity$path[[1]], by_clarity$plot[[1]], width = 6, height = 6)\nggsave(by_clarity$path[[2]], by_clarity$plot[[2]], width = 6, height = 6)\nggsave(by_clarity$path[[3]], by_clarity$plot[[3]], width = 6, height = 6)\n...\nggsave(by_clarity$path[[8]], by_clarity$plot[[8]], width = 6, height = 6)" + "text": "26.4 Saving multiple outputs\nIn the last section, you learned about map(), which is useful for reading multiple files into a single object. In this section, we’ll now explore sort of the opposite problem: how can you take one or more R objects and save it to one or more files? We’ll explore this challenge using three examples:\n\nSaving multiple data frames into one database.\nSaving multiple data frames into multiple .csv files.\nSaving multiple plots to multiple .png files.\n\n\n26.4.1 Writing to a database\nSometimes when working with many files at once, it’s not possible to fit all your data into memory at once, and you can’t do map(files, read_csv). One approach to deal with this problem is to load your data into a database so you can access just the bits you need with dbplyr.\nIf you’re lucky, the database package you’re using will provide a handy function that takes a vector of paths and loads them all into the database. 
This is the case with duckdb’s duckdb_read_csv():\n\ncon <- DBI::dbConnect(duckdb::duckdb())\nduckdb::duckdb_read_csv(con, \"gapminder\", paths)\n\nThis would work well here, but we don’t have csv files, instead we have excel spreadsheets. So we’re going to have to do it “by hand”. Learning to do it by hand will also help you when you have a bunch of csvs and the database that you’re working with doesn’t have one function that will load them all in.\nWe need to start by creating a table that will fill in with data. The easiest way to do this is by creating a template, a dummy data frame that contains all the columns we want, but only a sampling of the data. For the gapminder data, we can make that template by reading a single file and adding the year to it:\n\ntemplate <- readxl::read_excel(paths[[1]])\ntemplate$year <- 1952\ntemplate\n#> # A tibble: 142 × 6\n#> country continent lifeExp pop gdpPercap year\n#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>\n#> 1 Afghanistan Asia 28.8 8425333 779. 1952\n#> 2 Albania Europe 55.2 1282697 1601. 1952\n#> 3 Algeria Africa 43.1 9279525 2449. 1952\n#> 4 Angola Africa 30.0 4232095 3521. 1952\n#> 5 Argentina Americas 62.5 17876956 5911. 1952\n#> 6 Australia Oceania 69.1 8691212 10040. 1952\n#> # ℹ 136 more rows\n\nNow we can connect to the database, and use DBI::dbCreateTable() to turn our template into a database table:\n\ncon <- DBI::dbConnect(duckdb::duckdb())\nDBI::dbCreateTable(con, \"gapminder\", template)\n\ndbCreateTable() doesn’t use the data in template, just the variable names and types. 
So if we inspect the gapminder table now you’ll see that it’s empty but it has the variables we need with the types we expect:\n\ncon |> tbl(\"gapminder\")\n#> # Source: table<gapminder> [0 x 6]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]\n#> # ℹ 6 variables: country <chr>, continent <chr>, lifeExp <dbl>, pop <dbl>,\n#> # gdpPercap <dbl>, year <dbl>\n\nNext, we need a function that takes a single file path, reads it into R, and adds the result to the gapminder table. We can do that by combining read_excel() with DBI::dbAppendTable():\n\nappend_file <- function(path) {\n df <- readxl::read_excel(path)\n df$year <- parse_number(basename(path))\n \n DBI::dbAppendTable(con, \"gapminder\", df)\n}\n\nNow we need to call append_file() once for each element of paths. That’s certainly possible with map():\n\npaths |> map(append_file)\n\nBut we don’t care about the output of append_file(), so instead of map() it’s slightly nicer to use walk(). walk() does exactly the same thing as map() but throws the output away:\n\npaths |> walk(append_file)\n\nNow we can see if we have all the data in our table:\n\ncon |> \n tbl(\"gapminder\") |> \n count(year)\n#> # Source: SQL [?? x 2]\n#> # Database: DuckDB v0.9.1 [unknown@Linux 6.2.0-1015-azure:R 4.3.2/:memory:]\n#> year n\n#> <dbl> <dbl>\n#> 1 2007 142\n#> 2 1967 142\n#> 3 1977 142\n#> 4 1987 142\n#> 5 1952 142\n#> 6 1957 142\n#> # ℹ more rows\n\n\n26.4.2 Writing csv files\nThe same basic principle applies if we want to write multiple csv files, one for each group. Let’s imagine that we want to take the ggplot2::diamonds data and save one csv file for each clarity. First we need to make those individual datasets. 
There are many ways you could do that, but there’s one way we particularly like: group_nest().\n\nby_clarity <- diamonds |> \n group_nest(clarity)\n\nby_clarity\n#> # A tibble: 8 × 2\n#> clarity data\n#> <ord> <list<tibble[,9]>>\n#> 1 I1 [741 × 9]\n#> 2 SI2 [9,194 × 9]\n#> 3 SI1 [13,065 × 9]\n#> 4 VS2 [12,258 × 9]\n#> 5 VS1 [8,171 × 9]\n#> 6 VVS2 [5,066 × 9]\n#> # ℹ 2 more rows\n\nThis gives us a new tibble with eight rows and two columns. clarity is our grouping variable and data is a list-column containing one tibble for each unique value of clarity:\n\nby_clarity$data[[1]]\n#> # A tibble: 741 × 9\n#> carat cut color depth table price x y z\n#> <dbl> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>\n#> 1 0.32 Premium E 60.9 58 345 4.38 4.42 2.68\n#> 2 1.17 Very Good J 60.2 61 2774 6.83 6.9 4.13\n#> 3 1.01 Premium F 61.8 60 2781 6.39 6.36 3.94\n#> 4 1.01 Fair E 64.5 58 2788 6.29 6.21 4.03\n#> 5 0.96 Ideal F 60.7 55 2801 6.37 6.41 3.88\n#> 6 1.04 Premium G 62.2 58 2801 6.46 6.41 4 \n#> # ℹ 735 more rows\n\nWhile we’re here, let’s create a column that gives the name of the output file, using mutate() and str_glue():\n\nby_clarity <- by_clarity |> \n mutate(path = str_glue(\"diamonds-{clarity}.csv\"))\n\nby_clarity\n#> # A tibble: 8 × 3\n#> clarity data path \n#> <ord> <list<tibble[,9]>> <glue> \n#> 1 I1 [741 × 9] diamonds-I1.csv \n#> 2 SI2 [9,194 × 9] diamonds-SI2.csv \n#> 3 SI1 [13,065 × 9] diamonds-SI1.csv \n#> 4 VS2 [12,258 × 9] diamonds-VS2.csv \n#> 5 VS1 [8,171 × 9] diamonds-VS1.csv \n#> 6 VVS2 [5,066 × 9] diamonds-VVS2.csv\n#> # ℹ 2 more rows\n\nSo if we were going to save these data frames by hand, we might write something like:\n\nwrite_csv(by_clarity$data[[1]], by_clarity$path[[1]])\nwrite_csv(by_clarity$data[[2]], by_clarity$path[[2]])\nwrite_csv(by_clarity$data[[3]], by_clarity$path[[3]])\n...\nwrite_csv(by_clarity$data[[8]], by_clarity$path[[8]])\n\nThis is a little different to our previous uses of map() because there are two arguments that are 
changing, not just one. That means we need a new function: map2(), which varies both the first and second arguments. And because we again don’t care about the output, we want walk2() rather than map2(). That gives us:\n\nwalk2(by_clarity$data, by_clarity$path, write_csv)\n\n\n26.4.3 Saving plots\nWe can take the same basic approach to create many plots. Let’s first make a function that draws the plot we want:\n\ncarat_histogram <- function(df) {\n ggplot(df, aes(x = carat)) + geom_histogram(binwidth = 0.1) \n}\n\ncarat_histogram(by_clarity$data[[1]])\n\n\n\n\nNow we can use map() to create a list of many plots7 and their eventual file paths:\n\nby_clarity <- by_clarity |> \n mutate(\n plot = map(data, carat_histogram),\n path = str_glue(\"clarity-{clarity}.png\")\n )\n\nThen use walk2() with ggsave() to save each plot:\n\nwalk2(\n by_clarity$path,\n by_clarity$plot,\n \\(path, plot) ggsave(path, plot, width = 6, height = 6)\n)\n\nThis is shorthand for:\n\nggsave(by_clarity$path[[1]], by_clarity$plot[[1]], width = 6, height = 6)\nggsave(by_clarity$path[[2]], by_clarity$plot[[2]], width = 6, height = 6)\nggsave(by_clarity$path[[3]], by_clarity$plot[[3]], width = 6, height = 6)\n...\nggsave(by_clarity$path[[8]], by_clarity$plot[[8]], width = 6, height = 6)" }, { "objectID": "iteration.html#summary", diff --git a/sitemap.xml b/sitemap.xml index aebb4a21a..5c102f5e6 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -1,155 +1,155 @@ <?xml version="1.0" encoding="UTF-8"?> <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> <url> - <loc>https://r4ds.hadley.nz/index.html</loc> - <lastmod>2023-11-17T21:40:39.836Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/index.html</loc> + <lastmod>2023-11-17T22:08:26.363Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/preface-2e.html</loc> - <lastmod>2023-11-17T21:40:39.844Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/preface-2e.html</loc> + <lastmod>2023-11-17T22:08:26.371Z</lastmod> </url> <url> - 
<loc>https://r4ds.hadley.nz/intro.html</loc> - <lastmod>2023-11-17T21:40:39.856Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/intro.html</loc> + <lastmod>2023-11-17T22:08:26.383Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/whole-game.html</loc> - <lastmod>2023-11-17T21:40:39.864Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/whole-game.html</loc> + <lastmod>2023-11-17T22:08:26.391Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/data-visualize.html</loc> - <lastmod>2023-11-17T21:40:39.900Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/data-visualize.html</loc> + <lastmod>2023-11-17T22:08:26.427Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/workflow-basics.html</loc> - <lastmod>2023-11-17T21:40:39.912Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/workflow-basics.html</loc> + <lastmod>2023-11-17T22:08:26.439Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/data-transform.html</loc> - <lastmod>2023-11-17T21:40:39.968Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/data-transform.html</loc> + <lastmod>2023-11-17T22:08:26.515Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/workflow-style.html</loc> - <lastmod>2023-11-17T21:40:39.984Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/workflow-style.html</loc> + <lastmod>2023-11-17T22:08:26.531Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/data-tidy.html</loc> - <lastmod>2023-11-17T21:40:40.044Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/data-tidy.html</loc> + <lastmod>2023-11-17T22:08:26.567Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/workflow-scripts.html</loc> - <lastmod>2023-11-17T21:40:40.056Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/workflow-scripts.html</loc> + <lastmod>2023-11-17T22:08:26.579Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/data-import.html</loc> - <lastmod>2023-11-17T21:40:40.088Z</lastmod> + 
<loc>https://cienciadedatos.github.io/pt-r4ds/data-import.html</loc> + <lastmod>2023-11-17T22:08:26.607Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/workflow-help.html</loc> - <lastmod>2023-11-17T21:40:40.096Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/workflow-help.html</loc> + <lastmod>2023-11-17T22:08:26.619Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/visualize.html</loc> - <lastmod>2023-11-17T21:40:40.108Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/visualize.html</loc> + <lastmod>2023-11-17T22:08:26.623Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/layers.html</loc> - <lastmod>2023-11-17T21:40:40.152Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/layers.html</loc> + <lastmod>2023-11-17T22:08:26.667Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/EDA.html</loc> - <lastmod>2023-11-17T21:40:40.184Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/EDA.html</loc> + <lastmod>2023-11-17T22:08:26.695Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/communication.html</loc> - <lastmod>2023-11-17T21:40:40.244Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/communication.html</loc> + <lastmod>2023-11-17T22:08:26.803Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/transform.html</loc> - <lastmod>2023-11-17T21:40:40.252Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/transform.html</loc> + <lastmod>2023-11-17T22:08:26.811Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/logicals.html</loc> - <lastmod>2023-11-17T21:40:40.292Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/logicals.html</loc> + <lastmod>2023-11-17T22:08:26.847Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/numbers.html</loc> - <lastmod>2023-11-17T21:40:40.340Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/numbers.html</loc> + <lastmod>2023-11-17T22:08:26.903Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/strings.html</loc> - 
<lastmod>2023-11-17T21:40:40.440Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/strings.html</loc> + <lastmod>2023-11-17T22:08:26.939Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/regexps.html</loc> - <lastmod>2023-11-17T21:40:40.488Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/regexps.html</loc> + <lastmod>2023-11-17T22:08:26.987Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/factors.html</loc> - <lastmod>2023-11-17T21:40:40.520Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/factors.html</loc> + <lastmod>2023-11-17T22:08:27.015Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/datetimes.html</loc> - <lastmod>2023-11-17T21:40:40.564Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/datetimes.html</loc> + <lastmod>2023-11-17T22:08:27.063Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/missing-values.html</loc> - <lastmod>2023-11-17T21:40:40.588Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/missing-values.html</loc> + <lastmod>2023-11-17T22:08:27.083Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/joins.html</loc> - <lastmod>2023-11-17T21:40:40.632Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/joins.html</loc> + <lastmod>2023-11-17T22:08:27.131Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/import.html</loc> - <lastmod>2023-11-17T21:40:40.640Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/import.html</loc> + <lastmod>2023-11-17T22:08:27.135Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/spreadsheets.html</loc> - <lastmod>2023-11-17T21:40:40.668Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/spreadsheets.html</loc> + <lastmod>2023-11-17T22:08:27.167Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/databases.html</loc> - <lastmod>2023-11-17T21:40:40.712Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/databases.html</loc> + <lastmod>2023-11-17T22:08:27.207Z</lastmod> </url> <url> - 
<loc>https://r4ds.hadley.nz/arrow.html</loc> - <lastmod>2023-11-17T21:40:40.732Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/arrow.html</loc> + <lastmod>2023-11-17T22:08:27.227Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/rectangling.html</loc> - <lastmod>2023-11-17T21:40:40.784Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/rectangling.html</loc> + <lastmod>2023-11-17T22:08:27.275Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/webscraping.html</loc> - <lastmod>2023-11-17T21:40:40.812Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/webscraping.html</loc> + <lastmod>2023-11-17T22:08:27.303Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/program.html</loc> - <lastmod>2023-11-17T21:40:40.820Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/program.html</loc> + <lastmod>2023-11-17T22:08:27.311Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/functions.html</loc> - <lastmod>2023-11-17T21:40:40.884Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/functions.html</loc> + <lastmod>2023-11-17T22:08:27.375Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/iteration.html</loc> - <lastmod>2023-11-17T21:40:40.952Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/iteration.html</loc> + <lastmod>2023-11-17T22:08:27.447Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/base-R.html</loc> - <lastmod>2023-11-17T21:40:40.984Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/base-R.html</loc> + <lastmod>2023-11-17T22:08:27.475Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/communicate.html</loc> - <lastmod>2023-11-17T21:40:40.988Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/communicate.html</loc> + <lastmod>2023-11-17T22:08:27.483Z</lastmod> </url> <url> - <loc>https://r4ds.hadley.nz/quarto.html</loc> - <lastmod>2023-11-17T21:40:41.016Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/quarto.html</loc> + <lastmod>2023-11-17T22:08:27.515Z</lastmod> 
</url> <url> - <loc>https://r4ds.hadley.nz/quarto-formats.html</loc> - <lastmod>2023-11-17T21:40:41.032Z</lastmod> + <loc>https://cienciadedatos.github.io/pt-r4ds/quarto-formats.html</loc> + <lastmod>2023-11-17T22:08:27.559Z</lastmod> </url> </urlset> diff --git a/spreadsheets.html b/spreadsheets.html index c2d428aa3..b7d614090 100644 --- a/spreadsheets.html +++ b/spreadsheets.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -422,7 +422,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">20.4</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/spreadsheets.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/spreadsheets.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main 
class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/strings.html b/strings.html index 06a75f0f1..f6552282f 100644 --- a/strings.html +++ b/strings.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -444,7 +444,7 @@ </ul> </li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">14.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/strings.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/strings.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff 
--git a/transform.html b/transform.html index 33e5fd685..f5abc74c8 100644 --- a/transform.html +++ b/transform.html @@ -79,7 +79,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -656,7 +656,7 @@ <h1 class="title"><span id="sec-transform-intro" class="quarto-section-identifie <div class="nav-footer-left">R para Ciência de Dados (2ª edição) foi escrito por Hadley Wickham, Mine Çetinkaya-Rundel, e Garrett Grolemund.</div> <div class="nav-footer-center"> - <div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/transform.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> + <div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/transform.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> <div class="nav-footer-right">Este livro foi contruído com <a href="https://quarto.org/">Quarto</a>.</div> </div> </footer> diff --git a/visualize.html b/visualize.html index 153336eac..b317d62a7 100644 --- 
a/visualize.html +++ b/visualize.html @@ -79,7 +79,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -642,7 +642,7 @@ <h1 class="title"><span id="sec-visualize" class="quarto-section-identifier">Vis <div class="nav-footer-left">R para Ciência de Dados (2ª edição) foi escrito por Hadley Wickham, Mine Çetinkaya-Rundel, e Garrett Grolemund.</div> <div class="nav-footer-center"> - <div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/visualize.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> + <div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/visualize.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> <div class="nav-footer-right">Este livro foi contruído com <a href="https://quarto.org/">Quarto</a>.</div> </div> </footer> diff --git a/webscraping.html b/webscraping.html index 161d96edc..88a184162 100644 --- a/webscraping.html +++ b/webscraping.html @@ -113,7 +113,7 @@ <div class="sidebar-title 
mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -435,7 +435,7 @@ </li> <li><a href="#dynamic-sites" id="toc-dynamic-sites" class="nav-link" data-scroll-target="#dynamic-sites"><span class="header-section-number">24.7</span> Dynamic sites</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">24.8</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/webscraping.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/webscraping.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/whole-game.html b/whole-game.html index e0ca85a0d..9a4107535 100644 --- a/whole-game.html +++ b/whole-game.html @@ 
-79,7 +79,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -643,7 +643,7 @@ <h1 class="title"><span id="sec-whole-game-intro" class="quarto-section-identifi <div class="nav-footer-left">R para Ciência de Dados (2ª edição) foi escrito por Hadley Wickham, Mine Çetinkaya-Rundel, e Garrett Grolemund.</div> <div class="nav-footer-center"> - <div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/whole-game.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> + <div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/whole-game.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></div> <div class="nav-footer-right">Este livro foi contruído com <a href="https://quarto.org/">Quarto</a>.</div> </div> </footer> diff --git a/workflow-basics.html b/workflow-basics.html index 51bf0c665..5be5389cc 100644 --- a/workflow-basics.html +++ b/workflow-basics.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a 
href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -401,7 +401,7 @@ <li><a href="#chamando-fun%C3%A7%C3%B5es" id="toc-chamando-funções" class="nav-link" data-scroll-target="#chamando-fun%C3%A7%C3%B5es"><span class="header-section-number">2.4</span> Chamando funções</a></li> <li><a href="#exerc%C3%ADcios" id="toc-exercícios" class="nav-link" data-scroll-target="#exerc%C3%ADcios"><span class="header-section-number">2.5</span> Exercícios</a></li> <li><a href="#sum%C3%A1rio" id="toc-sumário" class="nav-link" data-scroll-target="#sum%C3%A1rio"><span class="header-section-number">2.6</span> Sumário</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/workflow-basics.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/workflow-basics.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header 
id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/workflow-help.html b/workflow-help.html index 571e2fa6b..7560352d3 100644 --- a/workflow-help.html +++ b/workflow-help.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -399,7 +399,7 @@ <li><a href="#making-a-reprex" id="toc-making-a-reprex" class="nav-link" data-scroll-target="#making-a-reprex"><span class="header-section-number">8.2</span> Making a reprex</a></li> <li><a href="#investing-in-yourself" id="toc-investing-in-yourself" class="nav-link" data-scroll-target="#investing-in-yourself"><span class="header-section-number">8.3</span> Investing in yourself</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">8.4</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/workflow-help.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a 
href="https://github.com/cienciadedatos/pt-r4ds/edit/main/workflow-help.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/workflow-scripts.html b/workflow-scripts.html index 537191391..cd1f49b21 100644 --- a/workflow-scripts.html +++ b/workflow-scripts.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -414,7 +414,7 @@ </li> <li><a href="#exercises" id="toc-exercises" class="nav-link" data-scroll-target="#exercises"><span class="header-section-number">6.3</span> Exercises</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">6.4</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a href="https://cienciadedatos.github.io/pt-r4ds/edit/main/workflow-scripts.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i 
class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/workflow-scripts.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title"> diff --git a/workflow-style.html b/workflow-style.html index 9e884f1d7..f9437c8f7 100644 --- a/workflow-style.html +++ b/workflow-style.html @@ -113,7 +113,7 @@ <div class="sidebar-title mb-0 py-0"> <a href="./">R para Ciência de Dados (2ª edição)</a> <div class="sidebar-tools-main"> - <a href="https://cienciadedatos.github.io/pt-r4ds" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-git"></i></a> + <a href="https://github.com/cienciadedatos/pt-r4ds/" rel="" title="Source Code" class="quarto-navigation-tool px-1" aria-label="Source Code"><i class="bi bi-github"></i></a> <a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Alternar modo de leitor"> <div class="quarto-reader-toggle-btn"> <i class="bi"></i> @@ -402,7 +402,7 @@ <li><a href="#sectioning-comments" id="toc-sectioning-comments" class="nav-link" data-scroll-target="#sectioning-comments"><span class="header-section-number">4.5</span> Sectioning comments</a></li> <li><a href="#exercises" id="toc-exercises" class="nav-link" data-scroll-target="#exercises"><span class="header-section-number">4.6</span> Exercises</a></li> <li><a href="#summary" id="toc-summary" class="nav-link" data-scroll-target="#summary"><span class="header-section-number">4.7</span> Summary</a></li> - </ul><div class="toc-actions"><div><i class="bi bi-git"></i></div><div class="action-links"><p><a 
href="https://cienciadedatos.github.io/pt-r4ds/edit/main/workflow-style.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://cienciadedatos.github.io/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> + </ul><div class="toc-actions"><div><i class="bi bi-github"></i></div><div class="action-links"><p><a href="https://github.com/cienciadedatos/pt-r4ds/edit/main/workflow-style.qmd" class="toc-action">Editar essa página</a></p><p><a href="https://github.com/cienciadedatos/pt-r4ds/issues/new" class="toc-action">Criar uma issue</a></p></div></div></nav> </div> <!-- main --> <main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block default"><div class="quarto-title">