February 11, 2023

Building a browser extension

Everything I learned building an extension for Chromium browsers

Recently I worked on building a browser extension. It was unexplored territory for me, and the road was a bit bumpy. I wanted to share some of my findings, as most of the Stack Overflow discussions I found were outdated or slightly misleading, especially with the transition from Manifest v2 to Manifest v3.

My target was Chromium-based browsers, so I can't speak for Firefox or Safari. Although Firefox supports the same WebExtensions APIs, I haven't tested it.

Stack

Here's a quick overview of the technologies I used:

  • Lit: Perfect fit for this use case. It's a lightweight library that allows you to write Web Components with ease.
  • TypeScript: No-brainer really. chrome-types were a massive help.
  • Storybook: Mandatory for development once you start coupling your app with browser extension APIs.
  • Vite: Vite was used for two purposes. To provide a preview environment where I could test my extension, and for bundling the extension in library mode (only JS output, no HTML).
  • Vitest: Super fast, no-config test runner. If you have used Jest with TypeScript you probably understand how annoying it is. Vitest just works.

Types of browser extensions

There are a few different options when building an extension.

First, you can build a "Popup" extension. Something like "Save to Notion". These are small web apps that load in the sandbox environment of a popup window, located in the extensions toolbar. You can use all of the browser extension APIs available, and you have great flexibility. If you want to bring in a framework like React, you can do it without any issues. It's an isolated environment, where you can go wild.

Browser extension APIs are about accessing the state of Tabs, Cookies, Bookmarks, and more. Here's an overview of the available APIs.

Second, you can build an extension that simply injects a script (called a content script) into the pages of your choosing. There you can manipulate the DOM. A common use case is something like Fantasy Football extensions, which add more metadata, like points and schedules, to the usually boring official pages. In that case, things get trickier. Most of the browser extension APIs aren't available to a content script, so you have to rely on a service worker to do the heavy lifting for you.

Also, the content script runs alongside the page and shares its DOM, so it's probably not a good idea to bring in a framework like React.

Lastly, you can have a combination of both, something like Grammarly or 1Password. You get your content script, plus a popup where you can prompt the user to tweak their settings.

In my case, I had to create an inline call-to-action button that opens a sidebar, so the popup approach wasn't viable. Everything that I'll be sharing here is based on the second approach.

React, Lit & conflicting styles

Right off the bat, I was thinking of building a React/Tailwind application. I quickly realized that since I can't use the popup route, I wouldn't run my application in a sandbox environment. I didn't want to load React (maybe a second version of it, if the page already does), or bother with conflicting styles.

Tailwind allows you to set a custom prefix. It helps with the same problem, albeit without 100% safety.

There's one elegant solution that handles both, and that's Web Components. Thankfully there's Lit where you can hit the ground running without any boilerplate. For a newbie to Web Components like me, it helped immensely.

Let's see a quick example of how a LitElement looks.

Here we have a simple component that handles the sidebar positioning. It doesn't have any other responsibility. By using composition (slots) we render the content, similar to what we would use {children} for in React.

TypeScript
sidebar-container.ts
import {css, html, LitElement} from 'lit';
import {customElement, property} from 'lit/decorators.js';
 
import {noop} from '../utils/noop';
import {icons} from './icons';
 
@customElement('sidebar-container')
export class SidebarContainer extends LitElement {
  static styles = css`
    :host {
      /* */
    }
    .sidebar {
      /* */
    }
    .content {
      /* */
    }
    .close-button {
      /* */
    }
    .header {
      /* */
    }
    .list {
      /* */
    }
  `;
 
  // Callbacks are passed as properties (.onClose=...), so no attribute is needed
  @property({attribute: false})
  onClose: VoidFunction = noop;
 
  render() {
    return html`<div class="sidebar">
      <div class="content">
        <div class="header">
          <button @click=${this.onClose} class="close-button">
            ${icons.xCircle}
          </button>
        </div>
        <div>
          <ul class="list">
            <slot></slot>
          </ul>
        </div>
      </div>
    </div>`;
  }
}

And here's an idea of how it would be used. You can pass reactive props, callbacks, or slotted elements, giving you all the tools you need.

TypeScript
some-root-component.ts
switch (this.state.status) {
  case 'loading':
    return html`<loading-state></loading-state>`;
  case 'unauthenticated':
    return html`<unauthenticated-state></unauthenticated-state>`;
  case 'error':
    return html`<error-state></error-state>`;
  case 'empty':
    return html`<empty-state></empty-state>`;
  case 'ready':
    return html`<sidebar-container .onClose=${this.handleClose}>
      ${this.data.map(
        (entry) => html`<li>
          <my-sidebar-row .entry=${entry}></my-sidebar-row>
        </li>`
      )}
    </sidebar-container>`;
}

Honestly, Web Components are the best option when writing browser extensions (of that kind at least). They're lightweight, well supported, solve the problem of conflicting styles, and they're easy to use.

I'm planning to write more about them in a future post, as I'm still learning about them. For now, I'll leave you with a link to the Lit documentation.

Posting messages from the content script to service worker

The main point of my application was to fetch data from a remote server. Unfortunately, we can't just use fetch from our injected script. We'll be slapped in the face by a CORS error. We have to create a service worker to do this for us.

First, we need to let our service-worker know that we want to fetch some data. To communicate with the service worker from the content script, we have to use the chrome.runtime.sendMessage API.

https://res.cloudinary.com/ds9pd4ywd/image/upload/v1675781730/blog-images/posts/building-browser-extension/content-script-setup_terc19.png

In the following example, I'm asking the service-worker to fetch some data the moment the component gets added to the DOM.

TypeScript
some-root-component.ts
@customElement('some-root-component')
export class SomeRootComponent extends LitElement {
  // ...
 
  connectedCallback() {
    super.connectedCallback();
 
    // On load, kindly ask the service-worker to fetch the data
    chrome.runtime.sendMessage(
      {type: 'fetchSomething', payload: {id: this.someId}},
      // and assign a callback to run when the service-worker responds
      (response: RequestState<MyResponse>) => {
        this._doSomething(response);
      }
    );
  }
}

And then we have the service worker lurking, waiting for the message.

TypeScript
service-worker.ts
import {MessageType} from '../types';
import {requests} from './requests';
 
type MessageSender = chrome.runtime.MessageSender;
 
// Listen on messages from the content script
chrome.runtime.onMessage.addListener(function (
  message: MessageType,
  _: MessageSender,
  sendResponse: (response: unknown) => void
) {
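  if (message.type === 'fetchSomething') {
    requests
      .fetchSomething(message.payload.id)
      .then((response) => {
        sendResponse(response);
      })
      .catch(() => {
        sendResponse({status: 'error', error: 'UNKNOWN_ERROR'});
      });
    // Returning true keeps the message channel open for the async response
    return true;
  }
  // Unhandled message types: close the channel right away
  return false;
});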

The message can be anything. Maybe we ask to create a bookmark. It depends on the use case. All that matters is that every message type has a handler in the service worker.
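One way to make sure no message type is left without a handler is to let the compiler check it. The following is only a sketch, not the code from my extension: the `MessageType` union, the `createBookmark` message, and the handler bodies are hypothetical stand-ins. The idea is a map keyed by message type, so adding a new message without a handler fails type-checking.

```typescript
// Hypothetical message shapes; the real MessageType is not shown in this post.
type MessageType =
  | {type: 'fetchSomething'; payload: {id: string}}
  | {type: 'createBookmark'; payload: {url: string; title: string}};

// One handler per message type. The mapped type turns a missing key into a compile error.
type Handlers = {
  [K in MessageType['type']]: (
    message: Extract<MessageType, {type: K}>
  ) => Promise<unknown>;
};

const handlers: Handlers = {
  // Stand-ins for the real work (a network request, a bookmarks call, ...).
  fetchSomething: async (message) => ({
    status: 'success',
    data: {id: message.payload.id},
  }),
  createBookmark: async (message) => ({
    status: 'success',
    data: message.payload.title,
  }),
};

// Route a message to its handler and always resolve with a value
// that can be handed to sendResponse, even when the handler throws.
function dispatch(message: MessageType): Promise<unknown> {
  const handler = handlers[message.type] as (m: MessageType) => Promise<unknown>;
  return handler(message).catch(() => ({
    status: 'error',
    error: 'UNKNOWN_ERROR',
  }));
}

// In the service worker, the listener would then shrink to something like:
//   chrome.runtime.onMessage.addListener((message, _sender, sendResponse) => {
//     dispatch(message).then(sendResponse);
//     return true; // respond async
//   });
```

The cast inside `dispatch` is the price for looking up the handler dynamically; the `Handlers` type still guarantees that the lookup can't come back empty.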

CORS & consistent extension key

Now let's talk about CORS. To allow CORS requests I had to make two tweaks:

  1. Explicitly state the remote-server URL in my manifest (under host_permissions in my manifest.json)
  2. Whitelist the extension in my CORS config on my server

Here's the problem. An unpacked extension doesn't get a stable id by default, so the id can differ between machines and install locations. If you have beta users, or your users are side-loading the extension, you don't have a single origin to whitelist. The solution is to pin the extension id by adding a key field to your manifest.json.

Now to ensure that this is unique and no other extension will share the same key, we have to create a draft entry in the Chrome web store (and/or Edge store). Even if you don't plan to publish your extension, you have to do this.

After you've created a draft entry, we add the key to our manifest.json and we're good to go. Every time you remove and re-add the extension, it will have the same id. This way we can safely whitelist a single chrome-extension:// URI.

Here's more about this approach

The same applies for the Edge store. In my extension I produce two builds extensions/edge & extensions/chrome each with different extension-ids in the manifest.json. This way I whitelist only these two origins in my backend.
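On the server side, the whitelist can be as small as a set lookup. This is a rough sketch rather than my backend code: the extension ids are placeholders, `corsHeadersFor` is a made-up helper name, and I'm assuming both builds send an Origin header of the form `chrome-extension://<id>`. How you attach the headers depends on your server framework.

```typescript
// Placeholder origins; the real ids come from the key in each build's manifest.json.
const ALLOWED_ORIGINS = new Set<string>([
  'chrome-extension://aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', // Chrome build (placeholder id)
  'chrome-extension://bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb', // Edge build (placeholder id)
]);

// Returns the CORS headers to attach to a response,
// or null when the request comes from an origin we don't know.
function corsHeadersFor(
  origin: string | undefined
): Record<string, string> | null {
  if (origin === undefined || !ALLOWED_ORIGINS.has(origin)) {
    return null;
  }
  return {
    // Requests that carry cookies need the exact origin echoed back;
    // browsers reject the '*' wildcard for credentialed requests.
    'Access-Control-Allow-Origin': origin,
    'Access-Control-Allow-Credentials': 'true',
    // The response varies with the Origin header, so caches must key on it.
    Vary: 'Origin',
  };
}
```

Echoing the exact origin instead of a wildcard matters once cookies are involved, which is where the next section is headed.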

Cookies & Authentication

Now, here's a question. How do you authenticate your user? We assume that we make a call to our server, but how do we ensure that the user is authenticated?

One approach is to prompt them to fill in their credentials. But that's not great UX. In my case, I had to use the same authentication flow as the main application. If a user is authenticated in the application, I want to reuse their cookie. I don't want to prompt them to log in again, just to use the extension.

Turns out there's a way to do this. First, you need to add the cookies permission in the manifest.json file, then ensure that your application's domain exists in the host_permissions array.

manifest.json
{
  "name": "My Extension",
  "version": "1.0",
  "manifest_version": 3,
  "description": "My Extension",
  "permissions": ["cookies"],
  "host_permissions": ["https://my-website.com"]
}

Now if you include cookies in your fetch/axios requests, they will be sent to the server. And if the user is authenticated, you'll get the same response as if you were using the application directly. This is a great way to reuse your authentication flow, without re-inventing the wheel. I was honestly surprised that this was possible.
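To make "include cookies" concrete: with fetch, that means passing `credentials: 'include'`. Below is a minimal sketch under a few assumptions of mine: the `RequestState` shape is a guess loosely based on the error object used earlier, `fetchWithSession` is a hypothetical helper name, and I'm assuming the server answers 401 when there is no valid session. The fetch function is passed in as a parameter so the helper can be exercised without a browser.

```typescript
// Hypothetical result shape, loosely mirroring the RequestState used earlier.
type RequestState<T> =
  | {status: 'success'; data: T}
  | {status: 'error'; error: 'UNAUTHENTICATED' | 'UNKNOWN_ERROR'};

// Minimal structural type, so the helper accepts the global fetch or a test double.
type FetchLike = (
  url: string,
  init: {credentials: 'include'; headers: Record<string, string>}
) => Promise<{ok: boolean; status: number; json(): Promise<unknown>}>;

async function fetchWithSession<T>(
  fetchImpl: FetchLike,
  url: string
): Promise<RequestState<T>> {
  try {
    // credentials: 'include' asks the browser to attach the site's cookies
    // even though the request is cross-origin.
    const response = await fetchImpl(url, {
      credentials: 'include',
      headers: {Accept: 'application/json'},
    });
    // Assumption: the server signals a missing or expired session with 401.
    if (response.status === 401) {
      return {status: 'error', error: 'UNAUTHENTICATED'};
    }
    if (!response.ok) {
      return {status: 'error', error: 'UNKNOWN_ERROR'};
    }
    return {status: 'success', data: (await response.json()) as T};
  } catch {
    // Network failures and invalid JSON both end up here.
    return {status: 'error', error: 'UNKNOWN_ERROR'};
  }
}
```

In the service worker you would pass the global `fetch` as the first argument. Mapping the 401 to its own state is what lets the UI show the unauthenticated view instead of a generic error.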

Posting messages from the service worker to the content scripts

Alright, let's do the reverse now. How do you send a message from the service worker to the content script? Well, you can't, at least not with chrome.runtime.sendMessage.

Each tab has its own content script. So if you want to send a message to a specific tab, you need to know the tab's id. And to get that piece of info, we need the tabs permission.

Check our lovely diagram again if you're confused.

https://res.cloudinary.com/ds9pd4ywd/image/upload/v1675781730/blog-images/posts/building-browser-extension/content-script-setup_terc19.png

Alright, let's add the tabs permission to our manifest.json file.

manifest.json
{
  "name": "My Extension",
  "version": "1.0",
  "manifest_version": 3,
  "description": "My Extension",
  "permissions": ["cookies", "tabs"],
  "host_permissions": ["https://my-website.com"]
}

Now, we can use the chrome.tabs.query API to query for any tabs that match our filters. And then we can use the chrome.tabs.sendMessage API to send a message to the content-scripts of those tabs.

Here's a little helper function that will send a message to all the tabs that match a specific URL pattern. It's a bit naive, but you get the idea.

TypeScript
service-worker.ts
function notifyAllContentScripts(message: MessageType, urlPattern: string) {
  chrome.tabs.query({url: urlPattern}, function (tabs) {
    for (const tab of tabs) {
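      if (tab.id !== undefined) {
        // Send the message; reading lastError in the callback avoids the
        // "unchecked runtime.lastError" warning for tabs without a listener
        chrome.tabs.sendMessage(tab.id, message, () => void chrome.runtime.lastError);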
      }
    }
  });
}

And finally, the component has to attach a listener to accept the message. Note that chrome.runtime.onMessage and chrome.runtime.sendMessage are among the few APIs available to the content script.

TypeScript
some-root-component.ts
@customElement('some-root-component')
export class SomeRootComponent extends LitElement {
  // ...
 
  connectedCallback() {
    super.connectedCallback();
 
    chrome.runtime.onMessage.addListener((request: MessageType) => {
      if (request.type === 'some-action') {
        this._doSomething();
      }
    });
  }
}

Listening to cookie changes

https://res.cloudinary.com/ds9pd4ywd/image/upload/v1675781719/blog-images/posts/building-browser-extension/cookie-flow_ktt2ou.png

Now let's combine some of the previous sections.

  1. We can reuse the cookies
  2. We can send messages from the content script to the service worker
  3. We can notify all content scripts from the service worker

So, imagine that the user is initially logged out, and we prompt them to log in. How would we fetch the data again after a successful login in another tab?

Thankfully there's the cookies.onChanged event that will notify us when a cookie changes.

Unfortunately, I've found that the event fires quite a lot (maybe it's my login flow code). So I've added a debounce function to only keep the last call.

TypeScript
service-worker.ts
function onCookieChange(changeInfo: chrome.cookies.CookieChangeInfo) {
  // Cookie was added - weird API design
  if (!changeInfo.removed) {
    if (changeInfo.cookie.domain.includes(appHostname)) {
      notifyAllContentScripts({type: 'refreshArticle'}, myTargetUrlPattern);
    }
  }
}
 
// Debounce the function to avoid spamming the content scripts
chrome.cookies.onChanged.addListener(debounce(onCookieChange, 500));

Then we notify the appropriate content scripts, and they can do whatever they want. They can post back and ask the service worker to do something, or they can do some logic themselves.
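The `debounce` helper used above isn't shown, and any utility library's version will do. For reference, here's a minimal trailing-edge sketch built on setTimeout. It's an illustration, not necessarily the implementation I used: every call resets the timer, and only the arguments of the last call in a burst reach the wrapped function.

```typescript
// Trailing-edge debounce: run fn once, waitMs after the last call in a burst.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  waitMs: number
): (...args: Args) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;

  return (...args: Args) => {
    // A new call cancels the pending one, so earlier arguments are dropped.
    if (timer !== undefined) {
      clearTimeout(timer);
    }
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}
```

With a 500 ms wait, a login flow that sets several cookies in quick succession results in a single notification to the content scripts.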

Custom fonts

Ok, enough with the data flow. Let's add some custom fonts.

I read various approaches online, but I found that creating the font stylesheet link programmatically works just fine.

TypeScript
main.ts
const gf = document.createElement('link');
gf.href =
  'https://fonts.googleapis.com/css2?family=Roboto:wght@300;400;500;700&display=swap';
gf.rel = 'stylesheet';
document.body.appendChild(gf);

Maybe there's a strong reason to use web_accessible_resources instead, so I'll keep an eye on that. For now, if the user blocks Google Fonts, I'm happy with system fonts.

Reacting to outside events

One requirement I had was to react to outside events. For example, if the user clicks on another DOM element, I want to close my extension. I used this very simple snippet.

TypeScript
some-root-component.ts
@customElement('some-root-component')
export class SomeRootComponent extends LitElement {
  connectedCallback() {
    super.connectedCallback();
    // ... more
    window.addEventListener('click', this._handleClickOutside);
  }
 
  disconnectedCallback() {
    window.removeEventListener('click', this._handleClickOutside);
    super.disconnectedCallback();
  }
 
  private _handleClickOutside = (event: Event) => {
    if (!event.composedPath().includes(this)) {
      this._handleSidebarVisibility(false);
    }
  };
}

I'm unsure if attaching a listener to window is the best approach, so I'm still on the lookout for a better solution.

Development

As for development and testing, I went with Vite, Vitest & Storybook. Storybook is mandatory because the moment you add browser extension APIs, you break Vite's development mode. Storybook supports Lit, so it just works. Here's the documentation on writing Lit elements in Storybook.

As for my very light Vite config, there's nothing special. Just picking the right entry points.

TypeScript
vite.config.ts
/// <reference types="vitest" />
import {defineConfig} from 'vite';
 
// https://vitejs.dev/config/
export default defineConfig({
  test: {
    globals: true,
  },
  build: {
    lib: {
      entry: ['src/client/main.ts', 'src/server/service-worker.ts'],
      formats: ['es'],
      name: 'MyExtension',
    },
  },
});

And here's the full content script entry point. The whole logic lives in the some-root-component.ts file, the equivalent of App.ts in a React app.

TypeScript
main.ts
// Import the polyfills :(
import '@webcomponents/custom-elements';
// Import the web components
// import './components/a.ts';
// import './components/b.ts';
// import './components/c.ts'; etc...
import './components/some-root-component';
 
// Add to the page
document.body.appendChild(document.createElement('some-root-component'));
 
// Load the fonts
const gf = document.createElement('link');
gf.href =
  'https://fonts.googleapis.com/css2?family=Roboto:wght@300;400;500;700&display=swap';
gf.rel = 'stylesheet';
document.body.appendChild(gf);

Bundling

My final output is two folders, one extensions/chrome and one extensions/edge. I'll omit posting my build script, but here's the gist of it:

  1. Build the Vite project, keep it in a temp folder
  2. Create two folders, one for Chrome and one for Edge
  3. Copy the Vite output to both folders
  4. Build the manifest.json file, merging my base config with the Chrome specific config
  5. Move it to the Chrome folder
  6. Build the manifest.json file, merging my base config with the Edge specific config
  7. Move it to the Edge folder
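The manifest steps above boil down to a shallow merge. Here's a hedged sketch of that part only: the base fields are copied from the manifest shown earlier, while the `browserOverrides` object, the placeholder key values, and the `buildManifest` name are made up for illustration. I'm assuming the key is the only field that differs between the two builds.

```typescript
type Manifest = Record<string, unknown>;

// Shared fields, taken from the manifest shown earlier in this post.
const baseManifest: Manifest = {
  name: 'My Extension',
  version: '1.0',
  manifest_version: 3,
  description: 'My Extension',
  permissions: ['cookies', 'tabs'],
  host_permissions: ['https://my-website.com'],
};

// Hypothetical per-browser overrides; the key values are placeholders.
const browserOverrides: Record<'chrome' | 'edge', Manifest> = {
  chrome: {key: '<chrome-public-key>'},
  edge: {key: '<edge-public-key>'},
};

// Shallow merge: browser-specific fields win over the base config.
function buildManifest(browser: 'chrome' | 'edge'): string {
  const manifest = {...baseManifest, ...browserOverrides[browser]};
  return JSON.stringify(manifest, null, 2);
}
```

The build script would then write the returned string to the manifest.json of the matching folder. A shallow merge is enough as long as the overrides only add or replace top-level fields; overriding a single entry inside permissions would need a deeper merge.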

Publishing

I've used the Chrome Developer Dashboard to publish the extension. It's a simple process, but it will set you back $5. The whole review process took about 2 days, so no complaints there.

Conclusion

Building a browser extension isn't as intimidating as I thought. It has its quirks, and some unknowns due to the introduction of Manifest v3, but nothing that can't be overcome.

I'm especially proud of using Web Components and Lit on this project. They have been on my radar for so long, but I never had the chance to use them. One possible next step would be to use Shoelace for styling, but I'll leave that for another day.

So there's that. A simple browser extension, built and published in a few days. I hope you found it useful.

