googlesheets4
Overviewβ
googlesheets4 provides an R interface to Google Sheets via the Sheets API v4. It is a reboot of an earlier package called googlesheets.
Why 4? Why googlesheets4? Did I miss googlesheets1 through 3? No.Β The idea is to name the package after the corresponding version of the Sheets API. In hindsight, the original googlesheets should have been googlesheets3.
Installationβ
You can install the released version of googlesheets4 from CRAN with:
install.packages("googlesheets4")
And the development version from GitHub with:
#install.packages("pak")
pak::pak("tidyverse/googlesheets4")
Cheatsheetβ
You can see how to read data with googlesheets4 in the data import cheatsheet, which also covers similar functionality in the related packages readr and readxl.
Authβ
googlesheets4 will, by default, help you interact with Sheets as an
authenticated Google user. If you donβt plan to write Sheets or to read
private Sheets, use gs4_deauth()
to indicate there is no need for a
token. See the article googlesheets4
auth
for more.
For this overview, weβve logged into Google as a specific user in a hidden chunk.
Attach googlesheets4β
library(googlesheets4)
Readβ
The main βreadβ function of the googlesheets4 package goes by two names, because we want it to make sense in two contexts:
read_sheet()
evokes other table-reading functions, likereadr::read_csv()
andreadxl::read_excel()
. Thesheet
in this case refers to a Google (spread)Sheet.range_read()
is the right name according to the naming convention used throughout the googlesheets4 package.
read_sheet()
and range_read()
are synonyms and you can use either
one. Here weβll use read_sheet()
.
googlesheets4 is pipe-friendly (and
reexports %>%
), but works just fine without the pipe.
Read from
- a URL
- a Sheet ID
- a
dribble
produced by the googledrive package, which can lookup by file name
These all achieve the same thing:
# URL
read_sheet("https://docs.google.com/spreadsheets/d/1U6Cf_qEOhiR9AZqTqS3mbMF3zt2db48ZP5v3rkrAEJY/edit#gid=780868077")
#> β Reading from "gapminder".
#> β Range 'Africa'.
#> # A tibble: 624 Γ 6
#> country continent year lifeExp pop gdpPercap
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Algeria Africa 1952 43.1 9279525 2449.
#> 2 Algeria Africa 1957 45.7 10270856 3014.
#> 3 Algeria Africa 1962 48.3 11000948 2551.
#> 4 Algeria Africa 1967 51.4 12760499 3247.
#> 5 Algeria Africa 1972 54.5 14760787 4183.
#> # βΉ 619 more rows
# Sheet ID
read_sheet("1U6Cf_qEOhiR9AZqTqS3mbMF3zt2db48ZP5v3rkrAEJY")
#> β Reading from "gapminder".
#> β Range 'Africa'.
#> # A tibble: 624 Γ 6
#> country continent year lifeExp pop gdpPercap
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Algeria Africa 1952 43.1 9279525 2449.
#> 2 Algeria Africa 1957 45.7 10270856 3014.
#> 3 Algeria Africa 1962 48.3 11000948 2551.
#> 4 Algeria Africa 1967 51.4 12760499 3247.
#> 5 Algeria Africa 1972 54.5 14760787 4183.
#> # βΉ 619 more rows
# a googledrive "dribble"
googledrive::drive_get("gapminder") %>%
read_sheet()
#> β The input `path` resolved to exactly 1 file.
#> β Reading from "gapminder".
#> β Range 'Africa'.
#> # A tibble: 624 Γ 6
#> country continent year lifeExp pop gdpPercap
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Algeria Africa 1952 43.1 9279525 2449.
#> 2 Algeria Africa 1957 45.7 10270856 3014.
#> 3 Algeria Africa 1962 48.3 11000948 2551.
#> 4 Algeria Africa 1967 51.4 12760499 3247.
#> 5 Algeria Africa 1972 54.5 14760787 4183.
#> # βΉ 619 more rows
Note: the only reason we can read a sheet named βgapminderβ (the last example) is because the account weβre logged in as has a Sheet named βgapminderβ.
See the article Find and Identify Sheets for more about specifying the Sheet you want to address. See the article Read Sheets for more about reading from specific sheets or ranges, setting column type, and getting low-level cell data.
Writeβ
gs4_create()
creates a brand new Google Sheet and can optionally send
some initial data.
(ss <- gs4_create("fluffy-bunny", sheets = list(flowers = head(iris))))
#> β Creating new Sheet: "fluffy-bunny".
#>
#> ββ <googlesheets4_spreadsheet> βββββββββββββββββββββββββββββββββββββββββββββββββ
#> Spreadsheet name: "fluffy-bunny"
#> ID: 1enILX4tYJeFEJ1RL8MsGgDRjb0NHTdm3ZD92R2RMWYI
#> Locale: en_US
#> Time zone: Etc/GMT
#> # of sheets: 1
#>
#> ββ <sheets> ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
#> (Sheet name): (Nominal extent in rows x columns)
#> 'flowers': 7 x 5
sheet_write()
(over)writes a whole data frame into a (work)sheet
within a (spread)Sheet.
head(mtcars) %>%
sheet_write(ss, sheet = "autos")
#> β Writing to "fluffy-bunny".
#> β Writing to sheet 'autos'.
ss
#>
#> ββ <googlesheets4_spreadsheet> βββββββββββββββββββββββββββββββββββββββββββββββββ
#> Spreadsheet name: "fluffy-bunny"
#> ID: 1enILX4tYJeFEJ1RL8MsGgDRjb0NHTdm3ZD92R2RMWYI
#> Locale: en_US
#> Time zone: Etc/GMT
#> # of sheets: 2
#>
#> ββ <sheets> ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
#> (Sheet name): (Nominal extent in rows x columns)
#> 'flowers': 7 x 5
#> 'autos': 7 x 11
sheet_append()
, range_write()
, range_flood()
, and range_clear()
are more specialized writing functions. See the article Write
Sheets
for more about writing to Sheets.
Where to learn moreβ
Get started is a more extensive general introduction to googlesheets4.
Browse the articles index to find articles that cover various topics in more depth.
See the function index for an organized, exhaustive listing.
Contributingβ
If youβd like to contribute to the development of googlesheets4, please read these guidelines.
Please note that the googlesheets4 project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Privacyβ
Contextβ
googlesheets4 draws on and complements / emulates other packages in the tidyverse:
- googlesheets is the package that googlesheets4 replaces. Main improvements in googlesheets4: (1) wraps the current, most modern Sheets API; (2) leaves all βwhole fileβ operations to googledrive; and (3) uses shared infrastructure for auth and more, from the gargle package. The v3 API wrapped by googlesheets is deprecated. Starting in April/May 2020, features will gradually be disabled and itβs anticipated the API will fully shutdown in September 2020. At that point, the original googlesheets package must be retired.
- googledrive provides a fully-featured interface to the Google Drive API. Any βwhole fileβ operations can be accomplished with googledrive: upload or download or update a spreadsheet, copy, rename, move, change permission, delete, etc. googledrive supports Team Drives.
- readxl is the tidyverse package for reading Excel files (xls or xlsx) into an R data frame. googlesheets4 takes cues from parts of the readxl interface, especially around specifying which cells to read.
- readr is the tidyverse package for reading delimited files (e.g., csv or tsv) into an R data frame. googlesheets4 takes cues from readr with respect to column type specification.