Google Sheets Data Interchange
This demo focuses on external data processing. For Google Apps Script custom functions, the "Google Sheets" extension demo covers Apps Script integration.
Google Sheets is a collaborative spreadsheet service with powerful external APIs for automation.
SheetJS is a JavaScript library for reading and writing data from spreadsheets.
This demo uses SheetJS to properly exchange data with spreadsheet files. We'll explore how to use NodeJS integration libraries and SheetJS in three data flows:
- "Importing data": Data in a NUMBERS spreadsheet will be parsed using SheetJS libraries and written to a Google Sheets document.
- "Exporting data": Data in Google Sheets will be pulled into arrays of objects. A workbook will be assembled and exported to Excel Binary workbooks (XLSB).
- "Exporting files": SheetJS libraries will read XLSX files exported by Google Sheets and generate CSV rows from every worksheet.
It is strongly recommended to create a new Google account for testing.
One small mistake could result in a block or ban from Google services.
Google Sheets deprecates APIs quickly and there is no guarantee that the referenced APIs will be available in the future.
Integration Details
This demo uses the Sheets v4 and Drive v3 APIs through the official googleapis connector module.
There are a number of steps to enable the Google Sheets API and Google Drive API for an account. The Complete Example covers the process.
Document Duality
Each Google Sheets document is identified with a unique ID. This ID can be found in the Google Sheets edit URL.
The edit URL starts with https://docs.google.com/spreadsheets/d/ and includes /edit. The ID is the path segment between /d/ and /edit. For example:
https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0
|^^^^^^^^^^^^^^^^^^^^^^^^^^^|--- ID
The same ID is used in Google Drive operations.
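The ID can also be pulled out of an edit URL in code. The following is a minimal sketch in plain JavaScript; the helper name `extractSpreadsheetId` is hypothetical and is not part of SheetJS or googleapis.

```javascript
// Hypothetical helper: pull the document ID out of a Google Sheets edit URL.
function extractSpreadsheetId(url) {
  // The ID is the path segment that follows "/spreadsheets/d/".
  const match = /\/spreadsheets\/d\/([^\/?#]+)/.exec(url);
  return match ? match[1] : null;
}

const exampleId = extractSpreadsheetId(
  "https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0"
);
console.log(exampleId); // a_long_string_of_characters
```

The helper returns `null` when the URL does not look like a Google Sheets URL, so callers can reject bad input before making API requests.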
The following operations are covered in this demo:
Operation | API |
---|---|
Create Google Sheets Document | Sheets |
Add and Remove worksheets | Sheets |
Modify data in worksheets | Sheets |
Share Sheets with other users | Drive |
Generate raw file exports | Drive |
Authentication
It is strongly recommended to use a service account for Google API operations. The "Service Account Setup" section covers how to create a service account and generate a JSON key file.
The generated JSON key file includes client_email and private_key fields. These fields can be used in JWT authentication:
import { google } from "googleapis";
// adjust the path to the actual key file.
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
/* connect to google services */
const jwt = new google.auth.JWT({
  email: creds.client_email,
  key: creds.private_key,
  scopes: [
    'https://www.googleapis.com/auth/spreadsheets', // Google Sheets
    'https://www.googleapis.com/auth/drive.file', // Google Drive
  ]
});
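The JSON import syntax shown above may not be accepted by every Node.js release. One alternative, sketched here under that assumption, is to read the key file as text (for example with `fs.readFileSync`) and validate the two required fields before building the JWT client. The helper name `parseServiceAccountKey` and the sample values are hypothetical.

```javascript
// Hypothetical helper: validate the text of a service account key file and
// return only the two fields used for JWT authentication.
function parseServiceAccountKey(text) {
  const parsed = JSON.parse(text);
  for (const field of ["client_email", "private_key"]) {
    if (typeof parsed[field] !== "string" || parsed[field].length === 0) {
      throw new Error(`key file is missing the "${field}" field`);
    }
  }
  return { client_email: parsed.client_email, private_key: parsed.private_key };
}

// Placeholder values for illustration only (not real credentials).
const sampleKeyText = JSON.stringify({
  type: "service_account",
  client_email: "demo@example.iam.gserviceaccount.com",
  private_key: "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
});
const sampleKey = parseServiceAccountKey(sampleKeyText);
console.log(sampleKey.client_email); // demo@example.iam.gserviceaccount.com
```

The returned object has the same `client_email` and `private_key` fields that the JWT constructor call reads from `creds`.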
Connecting to Services
The google named export includes special methods to connect to various APIs.
Google Sheets
const sheets = google.sheets({ version: "v4", auth: jwt });
google.sheets takes an options argument that includes API version number and authentication details.
Google Drive
const drive = google.drive({ version: "v3", auth: jwt });
google.drive takes an options argument that includes API version number and authentication details.
Array of Arrays
"Arrays of Arrays" are the main data format for interchange with Google Sheets. The outer array object includes row arrays, and each row array includes data.
SheetJS provides methods for working with Arrays of Arrays:
- aoa_to_sheet [1] creates SheetJS worksheet objects from arrays of arrays
- sheet_to_json [2] can generate arrays of arrays from SheetJS worksheets
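To make the shape concrete, the following sketch shows a small array of arrays and a plain JavaScript conversion to an array of objects. The sample data and the helper name `aoaToObjects` are made up for illustration; the helper is not a SheetJS function.

```javascript
// Sample data (made up): the outer array holds rows, each row holds cells.
// By convention here, the first row holds the column labels.
const sampleAoa = [
  ["Name", "Index"],
  ["Alpha", 1],
  ["Beta", 2]
];

// Hypothetical helper: turn an array of arrays into an array of objects,
// using the first row as the object keys.
function aoaToObjects(rows) {
  const [header = [], ...body] = rows;
  return body.map(row =>
    Object.fromEntries(header.map((key, i) => [key, row[i]]))
  );
}

const sampleObjects = aoaToObjects(sampleAoa);
console.log(sampleObjects.length);  // 2
console.log(sampleObjects[0].Name); // Alpha
```

Cells that are missing from a short row become `undefined` in the resulting object.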
Export Document Data
The goal is to create an XLSB export from a Google Sheet. Google Sheets does not natively support the XLSB format. SheetJS fills the gap.
Convert a Single Sheet
sheets.spreadsheets.values.get returns data from an existing Google Sheet. The method expects a range. Passing the sheet name as the range will pull all rows.
If successful, the response object will have a data property. It will be an object with a values property. The values will be represented as an Array of Arrays of values. This array of arrays can be converted to a SheetJS sheet:
async function gsheet_ws_to_sheetjs_ws(id, sheet_name) {
  /* get values */
  const res = await sheets.spreadsheets.values.get({
    spreadsheetId: id,
    range: `'${sheet_name}'`
  });
  const values = res.data.values;
  /* create SheetJS worksheet */
  const ws = XLSX.utils.aoa_to_sheet(values);
  return ws;
}
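The function above passes `res.data.values` straight to `aoa_to_sheet`. A defensive sketch follows, assuming the API can leave out trailing empty cells (giving rows of different lengths) and can omit `values` entirely for an empty sheet. The helper name `normalizeValues` is hypothetical.

```javascript
// Hypothetical helper: make the `values` payload safe to process.
// - a missing payload becomes an empty array
// - short rows are padded with null so every row has the same length
function normalizeValues(values) {
  const rows = Array.isArray(values) ? values : [];
  const width = rows.reduce((max, row) => Math.max(max, row.length), 0);
  return rows.map(row => {
    const out = row.slice(); // copy so the API response is not mutated
    while (out.length < width) out.push(null);
    return out;
  });
}

console.log(normalizeValues(undefined).length);                  // 0
console.log(normalizeValues([["a", "b", "c"], ["d"]])[1].length); // 3
```

In the function above, the call would become `XLSX.utils.aoa_to_sheet(normalizeValues(res.data.values))`.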
Convert a Workbook
sheets.spreadsheets.get returns metadata about the Google Sheets document. In the result object, the data property is an object which has a sheets property. The value of the sheets property is an array of sheet objects.
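As a rough illustration of that structure, the sketch below uses a trimmed, made-up object shaped like the metadata response and lists the ID and title of each sheet. The sample values and the helper name `listSheets` are hypothetical; a real response carries many more fields.

```javascript
// Trimmed, made-up sample shaped like a sheets.spreadsheets.get response.
const sampleResponse = {
  data: {
    spreadsheetId: "a_long_string_of_characters",
    sheets: [
      { properties: { sheetId: 0, title: "Sheet1", index: 0 } },
      { properties: { sheetId: 123456789, title: "Summary", index: 1 } }
    ]
  }
};

// Hypothetical helper: list { id, title } for each sheet, in response order.
function listSheets(response) {
  const sheetList = (response.data && response.data.sheets) || [];
  return sheetList.map(s => ({
    id: s.properties.sheetId,
    title: s.properties.title
  }));
}

console.log(listSheets(sampleResponse).map(s => s.title).join(", "));
// Sheet1, Summary
```

The `title` values are the names passed to the per-sheet conversion, and the `sheetId` values are the identifiers used by delete and rename requests.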
The SheetJS book_new [3] method creates blank SheetJS workbook objects. The book_append_sheet [4] method adds SheetJS worksheet objects to the workbook.
By looping across the sheets, the entire workbook can be converted:
async function gsheet_doc_to_sheetjs_wb(id) {
  /* Create a new workbook object */
  const wb = XLSX.utils.book_new();
  /* Get metadata */
  const wsheet = await sheets.spreadsheets.get({spreadsheetId: id});
  /* Loop across the Document sheets */
  for(let sheet of wsheet.data.sheets) {
    /* Get the worksheet name */
    const name = sheet.properties.title;
    /* Convert Google Sheets sheet to SheetJS worksheet */
    const ws = await gsheet_ws_to_sheetjs_ws(id, name);
    /* Append worksheet to workbook */
    XLSX.utils.book_append_sheet(wb, ws, name);
  }
  return wb;
}
This function returns a SheetJS workbook object that can be exported with the writeFile and write methods [5].
Update Document Data
The goal is to import data from a NUMBERS file to Google Sheets. Google Sheets does not natively support the NUMBERS format. SheetJS fills the gap.
Create New Document
sheets.spreadsheets.create creates a new Google Sheets document. It can accept a document title. It will generate a new workbook with a blank "Sheet1" sheet. The response includes the document ID for use in subsequent operations:
const res = await sheets.spreadsheets.create({
  requestBody: {
    properties: {
      /* Document Title */
      title: "SheetJS Test"
    }
  }
});
const id = res.data.spreadsheetId;
When using a service account, the main account does not have access to the new document by default. The document has to be shared with the main account using the Drive API:
await drive.permissions.create({
  fileId: id, // this ID was returned in the response to the create request
  fields: "id",
  requestBody: {
    type: "user",
    role: "writer",
    emailAddress: "YOUR_ADDRESS@gmail.com" // main address
  }
});
Delete Non-Initial Sheets
Google Sheets does not allow users to delete every worksheet.
The recommended approach involves deleting every worksheet after the first.
The delete operation requires a unique identifier for a sheet within the Google Sheets document. These IDs are found in the sheets.spreadsheets.get response.
The following snippet performs one bulk operation using batchUpdate:
/* get existing sheets */
const wsheet = await sheets.spreadsheets.get({spreadsheetId: id});
/* remove all sheets after the first */
if(wsheet.data.sheets.length > 1) await sheets.spreadsheets.batchUpdate({
  spreadsheetId: id,
  requestBody: { requests: wsheet.data.sheets.slice(1).map(s => ({
    deleteSheet: {
      sheetId: s.properties.sheetId
    }
  }))}
});
Rename First Sheet
The first sheet must be renamed so that the append operations do not collide with the legacy name. Since most SheetJS-supported file formats and most spreadsheet applications limit worksheet names to 31 characters, it is safe to use a placeholder title longer than 31 characters.
The updateSheetProperties update method can rename individual sheets:
/* rename first worksheet to avoid collisions */
await sheets.spreadsheets.batchUpdate({
  spreadsheetId: id,
  requestBody: { requests: [{
    updateSheetProperties: {
      fields: "title",
      properties: {
        sheetId: wsheet.data.sheets[0].properties.sheetId,
        // the new title is 34 characters, to be exact
        title: "thistitleisatleast33characterslong"
      }
    }
  }]}
});
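The hard-coded title works as long as no worksheet in the incoming file happens to use it. A minimal sketch of a more defensive choice follows, assuming the incoming worksheet names are available as an array (for example from the SheetNames property of a parsed workbook). The helper name `makePlaceholderTitle` and the default length of 34, which matches the literal title above, are illustrative.

```javascript
// Hypothetical helper: build a long placeholder title that does not appear
// in the list of worksheet names about to be appended.
function makePlaceholderTitle(incomingNames, minLength = 34) {
  // Pad a fixed prefix with underscores until it reaches the minimum length.
  let title = "sheetjs_placeholder".padEnd(minLength, "_");
  // In the unlikely event of a collision, keep extending the title.
  while (incomingNames.includes(title)) title += "_";
  return title;
}

console.log(makePlaceholderTitle(["Sheet1", "Summary"]).length); // 34
```

The result can be used as the `title` value in the updateSheetProperties request.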
Append Worksheets
The read and readFile methods generate SheetJS workbook objects from existing spreadsheet files.
Starting from a SheetJS workbook, the SheetNames property [6] is an array of worksheet names and the Sheets property is an object that maps sheet names to worksheet objects.
Looping over the worksheet names, there are two steps to appending a sheet:
- "Append a blank worksheet": The addSheet request, submitted through the sheets.spreadsheets.batchUpdate method, accepts a new title and creates a new worksheet. The new worksheet will be added at the end.
- "Write data to the new sheet": The SheetJS sheet_to_json method with the option header: 1 [7] will generate an array of arrays of data. This structure is compatible with the sheets.spreadsheets.values.update operation.
The following snippet pushes all worksheets from a SheetJS workbook into a Google Sheets document:
/* add sheets from file */
for(let name of wb.SheetNames) {
  /* (1) Create a new Google Sheets sheet */
  await sheets.spreadsheets.batchUpdate({
    spreadsheetId: id,
    requestBody: { requests: [
      /* add new sheet */
      { addSheet: { properties: { title: name } } },
    ] }
  });
  /* (2) Push data */
  const aoa = XLSX.utils.sheet_to_json(wb.Sheets[name], {header:1});
  await sheets.spreadsheets.values.update({
    spreadsheetId: id,
    range: `'${name}'!A1`,
    valueInputOption: "USER_ENTERED",
    requestBody: { values: aoa }
  });
}
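The loop above sends two requests per worksheet. When request quotas are a concern, the same work could be grouped into one spreadsheets.batchUpdate call and one spreadsheets.values.batchUpdate call. The sketch below only builds the two request bodies from a list of names and arrays of arrays; the helper name `buildBatchBodies` and the sample data are hypothetical, and the approach has not been tested against the live API as part of this demo.

```javascript
// Hypothetical helper: build the request bodies for
//  - one spreadsheets.batchUpdate call that adds every sheet
//  - one spreadsheets.values.batchUpdate call that writes every sheet
// `entries` is an array of { name, aoa } objects.
function buildBatchBodies(entries) {
  return {
    addSheets: {
      requests: entries.map(({ name }) => ({
        addSheet: { properties: { title: name } }
      }))
    },
    writeValues: {
      valueInputOption: "USER_ENTERED",
      data: entries.map(({ name, aoa }) => ({
        range: `'${name}'!A1`, // same range format as the loop above
        values: aoa
      }))
    }
  };
}

// Made-up sample data
const bodies = buildBatchBodies([
  { name: "Sales", aoa: [["Region", "Total"], ["North", 10]] },
  { name: "Costs", aoa: [["Item", "Amount"], ["Rent", 5]] }
]);
console.log(bodies.addSheets.requests.length); // 2
console.log(bodies.writeValues.data[0].range); // 'Sales'!A1
```

Each body would be passed as the `requestBody` of the corresponding call, alongside the `spreadsheetId`. The sheets must be added before the values are written, so the two calls run in that order.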
Delete Initial Sheet
After adding new worksheets, the final step involves removing the initial sheet.
The initial sheet ID can be pulled from the worksheet metadata fetched when the non-initial sheets were removed:
/* remove first sheet */
await sheets.spreadsheets.batchUpdate({
  spreadsheetId: id,
  requestBody: { requests: [
    /* remove old first sheet */
    { deleteSheet: { sheetId: wsheet.data.sheets[0].properties.sheetId } }
  ] }
});
Raw File Exports
In the web interface, Google Sheets can export documents to XLSX or ODS.
Raw file exports are exposed through the files.export method in the Drive API:
const drive = google.drive({ version: "v3", auth: jwt });
/* Request XLSX export */
const file = await drive.files.export({
  /* XLSX MIME type */
  mimeType: "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  fileId: id
});
The mimeType property is expected to be one of the supported formats [8]. When the demo was last tested, the following workbook conversions were supported:
Format | MIME Type |
---|---|
XLSX | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet |
ODS | application/x-vnd.oasis.opendocument.spreadsheet |
The response object has a data field whose value will be a Blob object. Data can be pulled into an ArrayBuffer and passed to the SheetJS read [9] method:
/* Obtain ArrayBuffer */
const ab = await file.data.arrayBuffer();
/* Parse */
const wb = XLSX.read(ab);
The code snippet works for XLSX and ODS. Google Sheets supports other formats with different integration logic.
Plaintext
The following formats are considered "plaintext":
Format | MIME Type |
---|---|
CSV (first sheet) | text/csv |
TSV (first sheet) | text/tab-separated-values |
For these formats, file.data is a JS string that can be parsed directly:
/* Request CSV export */
const file = await drive.files.export({ mimeType: "text/csv", fileId: id });
/* Parse CSV string*/
const wb = XLSX.read(file.data, {type: "string"});
HTML
Google Sheets has one relevant HTML type:
Format | MIME Type |
---|---|
HTML (all sheets) | application/zip |
The HTML export of a Google Sheets worksheet includes a row for the column labels (A, B, ...) and a column for the row labels (1, 2, ...).
The complete package is a ZIP file that includes a series of .html files. The files are written in tab order. The name of each file matches the name in Google Sheets.
This ZIP can be extracted using the embedded CFB library:
import { read, utils, CFB } from 'xlsx';
// -------------------^^^-- `CFB` named import
// ...
/* Parse Google Sheets ZIP file */
const cfb = CFB.read(new Uint8Array(ab), {type: "array"});
/* Create new SheetJS workbook */
const wb = utils.book_new();
/* Scan through each entry in the ZIP */
cfb.FullPaths.forEach((n, i) => {
  /* only process HTML files */
  if(n.slice(-5) != ".html") return;
  /* Extract worksheet name */
  const name = n.slice(n.lastIndexOf("/")+1).slice(0,-5);
  /* parse HTML */
  const htmlwb = read(cfb.FileIndex[i].content);
  /* add worksheet to workbook */
  utils.book_append_sheet(wb, htmlwb.Sheets.Sheet1, name);
});
At this point wb is a SheetJS workbook object [10].
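Since each export format needs a different MIME type and a different way of handling the response, a lookup table can keep the choice in one place. The following sketch copies the MIME types from the tables in this section; the `EXPORT_FORMATS` table, the `payload` labels and the `exportFormat` helper are illustrative names, not part of the Drive API or SheetJS.

```javascript
// MIME types copied from the tables in this section. The "payload" labels
// describe how the response is handled in this demo:
//   "binary": Blob -> ArrayBuffer -> SheetJS read
//   "text":   JS string -> SheetJS read with type "string"
//   "zip":    ZIP of .html files -> CFB.read, then SheetJS read per file
const EXPORT_FORMATS = {
  xlsx: { mimeType: "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", payload: "binary" },
  ods:  { mimeType: "application/x-vnd.oasis.opendocument.spreadsheet", payload: "binary" },
  csv:  { mimeType: "text/csv", payload: "text" },
  tsv:  { mimeType: "text/tab-separated-values", payload: "text" },
  html: { mimeType: "application/zip", payload: "zip" }
};

// Hypothetical helper: look up a format by name (case-insensitive).
function exportFormat(name) {
  const key = String(name).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(EXPORT_FORMATS, key)) {
    throw new Error(`unsupported export format: ${name}`);
  }
  return EXPORT_FORMATS[key];
}

console.log(exportFormat("CSV").mimeType); // text/csv
console.log(exportFormat("html").payload); // zip
```

The `mimeType` value would be passed to `drive.files.export`, and the `payload` label selects which of the parsing snippets above applies.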
Complete Example
This demo was last tested on 2024 June 08 using googleapis version 140.0.0.
The demo uses Sheets v4 and Drive v3 APIs.
The Google Cloud web interface changes frequently!
The screenshots and detailed descriptions may be out of date. Please report any issues to the docs repo or reach out to the SheetJS Discord server.
Account Setup
- Create a new Google account or log into an existing account.
A valid phone number (for SMS verification) may be required.
- Open https://console.cloud.google.com in a web browser.
If this is the first time accessing Google Cloud resources, a terms of service modal will be displayed. Review the Google Cloud Platform Terms of Service by clicking the "Google Cloud Platform Terms of Service" link.
You must agree to the Google Cloud Platform Terms of Service to use the APIs.
Check the box under "Terms of Service" and click "AGREE AND CONTINUE".
Project Setup
The goal of this section is to create a new project.
- Open the Project Selector.
In the top bar, between the "Google Cloud" logo and the search bar, there will be a selection box. Click the ▼ icon to show the modal.
If the selection box is missing, expand the browser window.
- Click "NEW PROJECT" in the top right corner of the modal.
- In the New Project screen, enter "SheetJS Test" in the Project name textbox and select "No organization" in the Location box. Click "CREATE".
A notification will confirm that the project was created.
API Setup
The goal of this section is to enable Google Sheets API and Google Drive API.
- Open the Project Selector (▼ icon) and select "SheetJS Test".
- In the search bar, type "Enabled" and select "Enabled APIs & services". This item will be in the "PRODUCTS & PAGES" part of the search results.
Enable Google Sheets API
- Near the top of the page, click "+ ENABLE APIS AND SERVICES".
- In the search bar near the middle of the page (not the search bar at the top), type "Sheets" and press Enter.
In the results page, look for "Google Sheets API". Click the card.
- In the Product Details screen, click the blue "ENABLE" button.
- Click the left arrow (<-) next to "API/Service details".
Enable Google Drive API
- Near the top of the page, click "+ ENABLE APIS AND SERVICES".
- In the search bar near the middle of the page (not the search bar at the top), type "Drive" and press Enter.
In the results page, look for "Google Drive API". Click the card.
- In the Product Details screen, click the blue "ENABLE" button.