{"id":344580,"date":"2021-07-22T07:44:26","date_gmt":"2021-07-22T14:44:26","guid":{"rendered":"https:\/\/css-tricks.com\/?p=344580"},"modified":"2021-08-02T18:56:57","modified_gmt":"2021-08-03T01:56:57","slug":"using-google-drive-as-a-cms","status":"publish","type":"post","link":"https:\/\/css-tricks.com\/using-google-drive-as-a-cms\/","title":{"rendered":"Using Google Drive as a CMS"},"content":{"rendered":"\n
We\u2019re going to walk through the technical process of hooking into Google Drive\u2019s API to source content on a website. We\u2019ll examine the step-by-step implementation, as well as how to utilize server-side caching to avoid major pitfalls such as API usage limits and image hotlinking. A ready-to-use npm package, Git repo, and Docker image are provided throughout the article.<\/p>\n\n\n\n\n\n\n
At some point in the development of a website, a crossroads is reached: how is content managed when the person managing it isn\u2019t technically savvy? If the content is managed by developers indefinitely, pure HTML and CSS will suffice \u2014 but this prevents wider team collaboration; besides, no developer wants to be on the hook for content updates in perpetuity.<\/p>\n\n\n\n
So what happens when a new non-technical partner needs to gain edit access? This could be a designer, a product manager, a marketing person, a company executive, or even an end customer.<\/p>\n\n\n\n
That\u2019s what a good content management system is for, right? Maybe something like WordPress. But this comes with its own set of disadvantages: it\u2019s a new platform for your team to juggle, a new interface to learn, and a new vector for potential attackers. It requires creating templates, a format with its own syntax and idiosyncrasies. Custom or third-party plugins may need to be vetted, installed, and configured for unique use cases \u2014 and each of these is yet another source of complexity, friction, technical debt, and risk. The bloat of all this setup may end up cramping your tech in a way that is counterproductive to the actual purpose of the website.<\/p>\n\n\n\n
What if we could pull content from where it already is? That\u2019s what we\u2019re getting at here. Many of the places where I have worked use Google Drive to organize and share files, and that includes things like blog and landing page content drafts. Could we utilize Google Drive\u2019s API to import a Google Doc directly into a site as raw HTML, with a simple REST request?<\/p>\n\n\n\n
Of course we can! Here\u2019s how we did it where I work.<\/p>\n\n\n
Just a few things you may want to check out as we get started:<\/p>\n\n\n\n
The first step is to establish a connection to Google Drive\u2019s API, and for that, we will need to do some kind of authentication. That\u2019s a requirement to use the Drive API even if the files in question are publicly shared (with \u201clink sharing\u201d turned on). Google supports several methods of doing this. The most common is OAuth, which prompts the user with a Google-branded screen saying, \u201c[So-and-so app] wants to access your Google Drive\u201d and waits for user consent \u2014 not exactly what we need here, since we\u2019d like to access files in a single central drive, rather than the user\u2019s drive. Plus, it\u2019s a bit tricky to provide access to only particular files or folders. The <code>https:\/\/www.googleapis.com\/auth\/drive.readonly<\/code> scope we might use is described as:<\/p>\n\n\n\n<blockquote><p>See and download all your Google Drive files.<\/p><\/blockquote>\n\n\n\n That\u2019s exactly what it says on the consent screen. This is potentially alarming for a user, and more to the point, it is a potential security weakness on any central developer\/admin Google account that manages the website content; anything they can access is exposed through the site\u2019s CMS back end, including their own documents and anything shared with them. Not good!<\/p>\n\n\n Instead, we can make use of a slightly less common authentication method: a Google service account. Think of a service account like a dummy Google account used exclusively by APIs and bots. However, it behaves like a first-class Google account; it has its own email address, its own tokens for authentication, and its own permissions. The big win here is that we make files available to this dummy service account just like any other user \u2014 by sharing the file with the service account\u2019s email address, which looks something like this:<\/p>\n\n\n\n When we go to display a doc or sheet on the website, we simply hit the \u201cShare\u201d button and paste in that email address. 
Now the service account can see only the files or folders we\u2019ve explicitly shared with it, and that access can be modified or revoked at any time. Perfect!<\/p>\n\n\n A service account can be created (for free) from the Google Cloud Platform Console<\/a>. That process is well documented in Google\u2019s developer resources<\/a>, and in addition it\u2019s described in step-by-step detail in the companion repo of this article on GitHub<\/a>. For the sake of brevity, let\u2019s fast-forward to immediately after a successful authentication of a service account.<\/p>\n\n\n Now that we\u2019re in, we\u2019re ready to start tinkering with what the Drive API is capable of. We can start from a modified version of the Node.js quickstart sample<\/a>, adjusted to use our new service account instead of client OAuth. That\u2019s handled in the first several methods of the <code>driveAPI.js<\/code> module we\u2019re constructing. One more note about the setup here \u2014 this code is intended to be called from server-side Node.js code. That\u2019s because the client credentials for the service account must be kept secret, and not exposed to users of our website. They are kept in a <code>credentials.json<\/code> file on the server. After the stage is set, loading raw HTML from a Google Doc becomes fairly simple. A method like this returns a Promise of an HTML string:<\/p>\n\n\n\n The Drive.Files.export<\/a> endpoint does all the work for us here. Also notice the two lines about caching images \u2014 this is a special consideration we\u2019ll skip over for now, and revisit in detail in the next section.<\/p>\n\n\n\n Here\u2019s an example of a Google document<\/a> displayed externally as HTML<\/a> using this method.<\/p>\n\n\n Fetching Google Sheets is almost as easy using Spreadsheets.values.get<\/a>. 
We adjust the response object just a little bit to convert it to a simplified JSON array, labeled with column headers from the first row of the sheet.<\/p>\n\n\n\n <strong>Example<\/strong>: this Sheet<\/a> is read and parsed in order to render custom HTML on this page<\/a>.<\/p>\n\n\n These two endpoints already get you very far, and form the backbone of a custom CMS for a website. But, in fact, they only tap the surface of Drive\u2019s potential for content management. It\u2019s also capable of:<\/p>\n\n\n\n The only limits here are your creativity, and the constraints of the full Drive API documented here<\/a>.<\/p>\n\n\n As you\u2019re playing with the various kinds of queries that the Drive API supports, you may end up receiving a \u201cUser Rate Limit Exceeded\u201d error message. It\u2019s fairly easy to hit this limit through repeated trial-and-error testing during the development phase, and at first glance, it seems as if it would represent a hard blocker for our Google Drive-CMS strategy.<\/p>\n\n\n\n This is where caching comes in \u2014 every time we fetch a new <em>version<\/em> of any file on Drive, we cache it locally (aka server-side, within the Node.js process). Once we do that, we only need to check the <em>version<\/em> of every file. If our cache is out of date, we fetch the newest version of the corresponding file, but that request only happens once per file <em>version<\/em>, rather than once per user request. Instead of scaling with the number of people who use the website, we now scale with the number of updates\/edits on Google Drive as our limiting factor. Under the current Drive usage limits on a free-tier account, we could support up to 300 API requests per minute. Caching should keep us well within this limit, and it could be optimized even further by batching multiple requests<\/a>.<\/p>\n\n\n The same caching method is applied to images embedded inside Google Docs. 
Caching ensures two things: first, that we are being <strong>respectful<\/strong> of Google\u2019s API usage limits, and truly utilize Google Drive as a front end for editing and file management (what the tool is intended for), rather than leaching off of it for free bandwidth and storage space. It keeps our website\u2019s interaction with Google\u2019s APIs to the bare minimum necessary to refresh the cache as needed.<\/p>\n\n\n\n The other benefit is one that the users of our website will enjoy: a <strong>responsive<\/strong> website with minimal load times. Since cached Google Docs are stored as static HTML on our own server, we can fetch them immediately without waiting for a third-party REST request to complete, keeping website load times to a minimum.<\/p>\n\n\n Since all this tinkering has been in server-side Node.js, we need a way for our client pages to interact with the APIs. By wrapping the <code>DriveAPI<\/code> in a thin REST layer, we can do exactly that. See the full express.js file in the companion repo<\/a>.<\/p>\n\n\n For deployment to production, we can run the Express server<\/a> alongside your existing static web server. Or, if it\u2019s convenient, we could easily wrap it in a Docker image:<\/p>\n\n\n\n
Enter the \u201cService account\u201d<\/h4>\n\n\n
google-drive-cms-example@npm-drive-cms.iam.gserviceaccount.com<\/code><\/pre>\n\n\n\n
Creating a service account<\/h4>\n\n\n
The Google Drive API<\/h3>\n\n\n
The <code>driveAPI.js<\/code><\/a> module we are constructing handles all of our interactions with the API. The key difference from Google\u2019s sample is in the <code>authorize()<\/code> method, where we use an instance of <code>jwtClient<\/code> rather than the <code>oauthClient<\/code> used in Google\u2019s sample:<\/p>\n\n\n\n
authorize(credentials, callback) {\n const { client_email, private_key } = credentials;\n\n const jwtClient = new google.auth.JWT(client_email, null, private_key, SCOPES)\n\n \/\/ Check if we have previously stored a token.\n fs.readFile(TOKEN_PATH, (err, token) => {\n if (err) return this.getAccessToken(jwtClient, callback);\n jwtClient.setCredentials(JSON.parse(token.toString()));\n console.log('Token loaded from file');\n callback(jwtClient);\n });\n}<\/code><\/pre>\n\n\n
Node.js vs. client-side<\/h4>\n\n\n
The <code>credentials.json<\/code> file lives on the server, and is loaded via <code>fs.readFile<\/code> inside of Node.js. It\u2019s also listed in the <code>.gitignore<\/code> to keep the sensitive keys out of source control.<\/p>\n\n\n
Fetching a doc<\/h4>\n\n\n
getDoc(id, skipCache = false) {\n return new Promise((resolve, reject) => {\n this.drive.files.export({\n fileId: id,\n mimeType: \"text\/html\",\n fields: \"data\",\n }, (err, res) => {\n if (err) return reject('The API returned an error: ' + err);\n resolve({ html: this.rewriteToCachedImages(res.data) });\n \/\/ Cache images\n this.cacheImages(res.data);\n });\n });\n}<\/code><\/pre>\n\n\n\n
The <code>id<\/code> we\u2019re passing in is just what shows up in your browser\u2019s address bar when you open the doc, immediately after <code>https:\/\/docs.google.com\/document\/d\/<\/code>.<\/p>\n\n\n\n
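If an editor hands over a full URL rather than a bare id, it is easy to peel the id out of it. This small helper is my own addition (not part of the companion repo), shown here as a sketch:

```javascript
// Hypothetical helper (not from the companion repo): extract a Google Doc id
// from a full URL like https://docs.google.com/document/d/<id>/edit
function docIdFromUrl(url) {
  const match = new URL(url).pathname.match(/\/document\/d\/([^/]+)/);
  if (!match) throw new Error('Not a Google Docs URL: ' + url);
  return match[1];
}

console.log(docIdFromUrl('https://docs.google.com/document/d/1abcDEF123/edit'));
// → 1abcDEF123
```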
Fetching a sheet<\/h4>\n\n\n
getSheet(id, range) {\n return new Promise((resolve, reject) => {\n this.sheets.spreadsheets.values.get({\n spreadsheetId: id,\n range: range,\n }, (err, res) => {\n if (err) return reject('The API returned an error: ' + err);\n \/\/ console.log(res.data.values);\n const keys = res.data.values[0];\n const transformed = [];\n res.data.values.forEach((row, i) => {\n if(i === 0) return;\n const item = {};\n row.forEach((cell, index) => {\n item[keys[index]] = cell;\n });\n transformed.push(item);\n });\n resolve(transformed);\n });\n });\n}<\/code><\/pre>\n\n\n\n
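The reshaping at the end of `getSheet` is easy to sanity-check in isolation. Here is an equivalent, condensed version of that transform applied to a hand-written sample of what `res.data.values` returns (the sample data is made up):

```javascript
// Sample of what res.data.values looks like for a sheet with a header row
const values = [
  ['title', 'url'],
  ['Hello World', 'https://example.com/hello'],
  ['Second Post', 'https://example.com/second'],
];

// Same idea as getSheet: zip each data row with the header row
const [keys, ...rows] = values;
const transformed = rows.map(row =>
  Object.fromEntries(row.map((cell, i) => [keys[i], cell]))
);

console.log(transformed);
// → [ { title: 'Hello World', url: 'https://example.com/hello' },
//     { title: 'Second Post', url: 'https://example.com/second' } ]
```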
id<\/code> parameter is the same as for a doc, and the new
range<\/code> parameter here refers to a range of cells to fetch values from, in Sheets A1 notation<\/a>.<\/p>\n\n\n\n
\u2026and more!<\/h4>\n\n\n
Caching<\/h3>\n\n\n
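The version-gated cache described above boils down to a small amount of logic. Here is a minimal sketch with the two Drive calls injected as plain async functions so it can be read independently of the API; the names are mine, not the companion repo's (`getVersion` would hit `drive.files.get` asking only for the version field, `getContent` would do the expensive export):

```javascript
// Sketch of version-gated caching (function and field names are my own).
// getVersion(id): a cheap call that returns the file's current version.
// getContent(id): the expensive call that exports the file's content.
function makeVersionCache(getVersion, getContent) {
  const cache = new Map(); // fileId → { version, content }
  return async function fetchCached(id) {
    const version = await getVersion(id); // cheap check on every request
    const hit = cache.get(id);
    if (hit && hit.version === version) return hit.content; // cache hit
    const content = await getContent(id); // only runs once per file version
    cache.set(id, { version, content });
    return content;
  };
}
```

The point of the design is that the expensive export request scales with the number of edits on Drive, not with the number of site visitors.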
Handling images<\/h4>\n\n\n
The <code>getDoc<\/code> method parses the HTML response for any image URLs, and makes a secondary request to download them (or fetches them directly from cache if they\u2019re already there). Then it rewrites the original URL in the HTML. The result is fully static HTML; we never use <strong>hotlinks<\/strong> to Google image CDNs. By the time it gets to the browser, the images have already been pre-cached.<\/p>\n\n\n
Respectful and responsive<\/h4>\n\n\n
Wrapping in Express<\/h3>\n\n\n
Wrapping the <code>DriveAPI<\/code> into its own REST service creates a middleman\/proxy that abstracts away all the logic of caching and fetching new versions, while keeping the sensitive authentication credentials safe on the server side.<\/p>\n\n\n\n A series of <code>express<\/code> routes, or the equivalent in your favorite web server, will do the trick:<\/p>\n\n\n\n
const driveAPI = new (require('.\/driveAPI'))();\nconst express = require('express');\nconst app = express();\nconst API_VERSION = 1;\nconst router = express.Router();\n\nrouter.route('\/getDoc')\n.get((req, res) => {\n console.log('GET \/getDoc', req.query.id);\n driveAPI.getDoc(req.query.id)\n .then(data => res.json(data))\n .catch(error => {\n console.error(error);\n res.sendStatus(500);\n });\n});\n\n\/\/ Other routes included here (getSheet, getImage, listFiles, etc)...\n\napp.use(`\/api\/v${API_VERSION}`, router);<\/code><\/pre>\n\n\n\n
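On the client side, pages then only ever talk to our own server, never to Google directly. A sketch of how a page might consume the route above (the helper names are mine; the endpoint shape comes from the Express code):

```javascript
// Build the proxy URL for a doc (route shape from the Express code above)
const docEndpoint = id => `/api/v1/getDoc?id=${encodeURIComponent(id)}`;

// In the browser, the cached HTML can be dropped straight into the page
async function renderDoc(id, el) {
  const res = await fetch(docEndpoint(id));
  const { html } = await res.json();
  el.innerHTML = html;
}
```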
Bonus: Docker Deployment<\/h3>\n\n\n
FROM node:8\n# Create app directory\nWORKDIR \/usr\/src\/app\n# Install app dependencies\n# A wildcard is used to ensure both package.json AND package-lock.json are copied\n# where available (npm@5+)\nCOPY package*.json .\/\nRUN npm install\n# If you are building your code for production\n# RUN npm ci --only=production\n# Bundle app source\nCOPY . .\nCMD [ \"node\", \"express.js\" ]<\/code><\/pre>\n\n\n\n