GitHub Actions and Playwright to Generate Web Page Screenshots

Automating Web Page Screenshots with GitHub Actions and Playwright

You just wrapped up your web project, and now it’s time to make sure it looks great across devices and browsers. You could manually check every page, but let’s be real—who has time for that? Instead, let’s automate the process using GitHub Actions and Playwright to generate web page screenshots automatically. This setup will give you instant feedback on your website’s appearance and help you catch issues before they go live.

In this guide, I’ll walk you through the entire setup, from installing Playwright to configuring GitHub Actions. By the end, you’ll have a fully automated system that runs on every code push. Grab your coffee, and let’s dive in!

What Are GitHub Actions?

GitHub Actions is a built-in CI/CD (Continuous Integration and Continuous Deployment) tool that automates your development workflows. It allows you to run scripts in response to events like code pushes, pull requests, or scheduled triggers.

Why Use GitHub Actions?

  • Seamless Integration: It’s built into GitHub, so you don’t need any external setup.
  • Flexible Workflows: You can create custom automation with YAML.
  • Pre-Built Actions: A marketplace full of ready-to-use actions.

Read more on GitHub Actions.

What Is Playwright?

Playwright, developed by Microsoft, is an automation framework for browser testing. It allows you to write scripts that simulate real-user interactions, and best of all, it works across multiple browsers.

Why Playwright?

  • Multi-Browser Support: Works with Chromium, Firefox, and WebKit.
  • Headless Mode: Run tests in the background without opening a browser window.
  • Powerful API: Automate clicks, form fills, and even screenshots effortlessly.

Check out Playwright on GitHub.

Setting Up Your Project

Let’s get started by setting up a GitHub repository and configuring Playwright for screenshot generation.

Step 1: Create a New GitHub Repository

  1. Head to GitHub and create a new repository.
  2. Name it something like webpage-screenshot-automation.
  3. Clone the repo to your local machine.

Step 2: Install Playwright

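Inside the cloned repository, initialize a Node.js project, install Playwright, and download the Chromium browser it will drive. Recent Playwright releases do not download browsers during npm install, so the last command fetches Chromium explicitly.

cd webpage-screenshot-automation
npm init -y
npm install playwright
npx playwright install chromium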

Step 3: Write the Screenshot Script

Create a new file called screenshot.js and add the following code:

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Navigate to the desired page
  await page.goto('https://example.com');

  // Capture a screenshot
  await page.screenshot({ path: 'screenshot.png' });

  await browser.close();
})();

Run the script to test it locally:

node screenshot.js

If everything is set up correctly, you should see screenshot.png in your project directory.
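
For CI use, it helps if the script always closes the browser and fails loudly when something goes wrong. The following is a minimal sketch of that idea, not a definitive implementation: captureScreenshot and its parameter names are made up for illustration, and the browser type is passed in so the function is easy to exercise without a real browser.

```javascript
// Sketch only: captureScreenshot and its parameters are illustrative names.
async function captureScreenshot(browserType, url, outputPath) {
  const browser = await browserType.launch();
  try {
    const page = await browser.newPage();
    // Wait for the load event before capturing.
    await page.goto(url, { waitUntil: 'load' });
    // fullPage captures the whole scrollable page, not just the viewport.
    await page.screenshot({ path: outputPath, fullPage: true });
    return outputPath;
  } finally {
    // Always release the browser, even if navigation or capture throws.
    await browser.close();
  }
}

// Usage (needs the playwright package and an installed Chromium):
//   const { chromium } = require('playwright');
//   captureScreenshot(chromium, 'https://example.com', 'screenshot.png')
//     .catch((err) => { console.error(err); process.exitCode = 1; });
```

Setting a non-zero exit code on failure is what makes the GitHub Actions step show up as failed instead of silently producing no screenshot.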

Automating with GitHub Actions

Now, let’s set up a GitHub Actions workflow that will run automatically whenever you push changes to your repo.

Step 1: Create a Workflow File

Inside your project, create .github/workflows/screenshot.yml and add the following:

name: Generate Screenshot

on:
  push:
    branches:
      - main

jobs:
  screenshot-job:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4

      - name: Set Up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install Dependencies
        run: npm ci

      # Download Chromium and the system libraries it needs on the runner
      - name: Install Playwright Browser
        run: npx playwright install --with-deps chromium

      - name: Generate Screenshot
        run: node screenshot.js

      - name: Upload Screenshot Artifact
        uses: actions/upload-artifact@v4
        with:
          name: screenshot
          path: screenshot.png

Step 2: Push Your Changes

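Add node_modules to .gitignore so your dependencies are not committed, then commit and push everything to GitHub:

echo "node_modules/" >> .gitignore
git add .
git commit -m "Set up Playwright screenshot automation"
git push origin main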

Step 3: Check GitHub Actions

Go to the Actions tab in your GitHub repo. You should see the workflow running. Once completed, the screenshot will be available as an artifact.

⚠️ The Playwright browser installation step can take a long time

Installing Playwright browsers in GitHub Actions can take a while: running npx playwright install without arguments downloads every supported browser (Chromium, Firefox, and WebKit). If you only need one browser, such as Chromium, install just that one:

npx playwright install chromium

The script above already launches only Chromium, so it needs no changes. If you switch to Firefox or WebKit later, install that browser instead.

🏎️ Speeding Up Workflow with Caching

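Since downloading and installing Playwright browsers happens on every workflow run, caching them can significantly speed up execution. Add this step after dependencies are installed and before Playwright’s browsers are downloaded or used:

      - name: Cache Playwright Browsers
        uses: actions/cache@v4
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
          restore-keys: |
            playwright-${{ runner.os }}-

This ensures that Playwright’s browser binaries are reused across runs, reducing setup time. Including the lockfile hash in the key refreshes the cache when the installed Playwright version changes, so stale browser builds are not kept forever.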

Wrapping Up

That’s it! You’ve successfully set up an automated workflow that generates web page screenshots every time you push your code. This can save you time and help catch layout issues before they reach production.

Next Steps

  • Modify the script to capture multiple pages.
  • Add email or Slack notifications.
  • Extend it to handle different screen sizes.
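
As a starting point for the first and third ideas, here is a rough sketch that expands a list of pages and viewport sizes into one screenshot per combination. The page list, the viewport sizes, the screenshots/ output folder, and the function names are all hypothetical placeholders, not part of the setup above.

```javascript
// Hypothetical page list and viewport sizes; replace with your own.
const PAGES = [
  { name: 'home', url: 'https://example.com/' },
  { name: 'about', url: 'https://example.com/about' },
];
const VIEWPORTS = [
  { name: 'mobile', width: 375, height: 667 },
  { name: 'desktop', width: 1280, height: 720 },
];

// Expand pages x viewports into one job per screenshot file.
function buildJobs(pages, viewports) {
  return pages.flatMap((page) =>
    viewports.map((viewport) => ({
      url: page.url,
      viewport: { width: viewport.width, height: viewport.height },
      path: `screenshots/${page.name}-${viewport.name}.png`,
    }))
  );
}

// Run every job in one browser, using a fresh page per job.
async function captureAll(browser, jobs) {
  for (const job of jobs) {
    const page = await browser.newPage({ viewport: job.viewport });
    try {
      await page.goto(job.url);
      await page.screenshot({ path: job.path, fullPage: true });
    } finally {
      await page.close();
    }
  }
}

// Usage (needs the playwright package and an installed Chromium):
//   const { chromium } = require('playwright');
//   const browser = await chromium.launch();
//   await captureAll(browser, buildJobs(PAGES, VIEWPORTS));
//   await browser.close();
```

If you go this way, the upload step in the workflow would need to point at the screenshots/ folder instead of a single file.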

If you found this guide useful, share it with your fellow developers! Got questions? Drop them in the comments. 🚀

Google Sheets + Zapier is a perfect gateway for quick integrations when bootstrapping a new tool/service

When bootstrapping a new product, regardless of the platform and stack used on the back end and front end, the time comes very quickly when you need to integrate with third-party platforms to keep the product’s user experience continuous across different solutions.

As a good example, let’s say you bootstrapped a small SaaS product that helps users calculate their taxes. Your product is not only the software you created but the whole experience, from customer support to documentation and educational materials, and perhaps some marketing when acquiring and onboarding new users. So right off the bat, you will need a customer support solution and a marketing tool, and perhaps a CRM-like tool to use as your “customer master” database, where you will want to channel as much as you can.

But when someone signs up, your back end only creates their user account; their customer support record, CRM record, and marketing track are not connected. Most likely, these live in separate services like Intercom, Zendesk, or Mailchimp, apart from your own back end and database, where the user’s initial records are created and their engagement with your core product happens.

I have planned and built these integrations many times across different products and worked with many third-party services. Some niche solutions I had to integrate with don’t have proper APIs or capabilities. Setting those exceptions aside, most tools integrate with well-known platforms like Salesforce, Facebook Ads, IFTTT, and Slack. And, as a common and growing theme, most tools also integrate with Zapier, which is the main point I want to get to.

Eventually, I found myself evaluating Zapier integrations between these platforms to cover most of the use cases where we would otherwise spend days building a single integration. When the available triggers and actions cover what we are trying to do, I now suggest that my clients and the rest of my team build Zapier-focused integrations.

There is an easier way. The large majority of people working in process, product, or team management use spreadsheets daily, and Excel or Google Sheets covers most of their use cases. I evangelize Google Sheets because of its real-time collaboration and ease of access. It’s free, and most people have a Google account, which makes it nearly universal.

I have built direct Google Sheets integrations many times in the past. But recently I have come to like the concept of using a Google Sheet as a source that other services can share for integration purposes. Since it’s a living document, it’s very easy to change it or to listen to changes made to it (by humans or APIs). This makes it an amazing candidate to pair with Zapier as a “source” of data. Zapier becomes the magic glue, a universal adapter to anything else we want to connect to. With thousands of services available, Zapier is a meeting ground for moving the data we provide through a Google Sheet to anywhere else.

I should say this is limited by each service’s capabilities and the actions and triggers available on the Zapier platform. But most SaaS solutions invest enough time and effort to make their Zapier integrations rich enough to serve the most common use cases. It won’t cover 100% of needs, but it will certainly eliminate a lot of basic integrations such as Slack and email notifications or marketing tool triggers (e.g., follow-up campaigns).

This is not a code-less solution

When going down this route, the biggest piece of work will be integrating the Google Sheets API: connecting your account (through the OAuth process), storing the credentials on your server, and building the server → Google Sheets integration that sends your back-end changes to a spreadsheet. It’s not the easiest API to integrate with, but it’s well documented, mature, and has endless examples in the community (GitHub). Best of all, this one integration opens up many others without further integration work. Even in the most basic products, we find ourselves building Slack and email delivery into MVP versions. Investing the same effort in Google Sheets will easily justify itself later.
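
To give a feel for the server → Google Sheets side, here is a minimal sketch using the Sheets v4 values.append call from the googleapis Node.js package. Everything specific in it is an assumption for illustration: the user fields, the column order, the “Signups” tab name, and the function names. It also assumes you already hold an authorized OAuth2 client from the OAuth process mentioned above.

```javascript
// Sketch under assumptions: the user fields, column order, 'Signups' tab name,
// and function names are all hypothetical.

// Flatten a user record into one spreadsheet row.
function userToRow(user) {
  return [user.id, user.email, user.plan, new Date(user.createdAt).toISOString()];
}

// Append one row through an injected Sheets v4 client.
async function appendSignupRow(sheets, spreadsheetId, user) {
  const response = await sheets.spreadsheets.values.append({
    spreadsheetId,
    range: 'Signups!A1',      // hypothetical tab; rows are appended after its table
    valueInputOption: 'RAW',  // store values as-is, without formula parsing
    requestBody: { values: [userToRow(user)] },
  });
  return response.data;
}

// Build a real client with the googleapis package. It is loaded lazily so the
// helpers above can be exercised without the package installed.
function createSheetsClient(authClient) {
  const { google } = require('googleapis');
  return google.sheets({ version: 'v4', auth: authClient });
}

// Usage, e.g. in your signup handler:
//   const sheets = createSheetsClient(authorizedOAuth2Client);
//   await appendSignupRow(sheets, 'your-spreadsheet-id', newUser);
```

Each appended row is what Zapier would later pick up as a new spreadsheet row. Using RAW input keeps user-supplied text from being interpreted as a formula.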

Trade-offs

One big trade-off is that your users’ PII is transported to and stored in a Google Sheet (which should be private) and then sent to Zapier. If you are especially cautious or have to comply with certain privacy regulations, this traffic may need to be handled more carefully, or the approach may be completely unfeasible for your product. But the majority of products I have built do not need that level of audit and compliance rigor, so this solution has worked for me many times.

Example

I want to show a sample integration that sets up a Google Sheet as the trigger and a Slack notification as the action. Hopefully, this sparks some imagination and helps you see where this can go.

Set up Google Sheet changes as “trigger”

Create a new zap, or edit an existing one to change the “trigger” service, and select Google Sheets.
In the first step, you will be asked to select the Google account linked to your Zapier account. If you haven’t linked one yet, or want to connect a different account, you can do it in this step.

After selecting the account, Zapier will ask which event this zap should listen to. Generally, we will insert a new row into a sheet in one of the documents, so we select “New Spreadsheet Row” as the event. Zapier also offers other events, such as an updated spreadsheet row or a new worksheet being created in a document.

Now you need to select which document and which worksheet to listen to. Zapier shows document and sheet selection dropdowns here.

As the final step, you can (and more or less have to) test your trigger, which pulls a sample row from your sheet. Make sure you have entered values in your columns, because Zapier uses this sample data when you set up further actions and shows the sample values wherever you reference them.

Set up Slack as “action” to send a message to a channel

Now we can use this trigger with any service we want. We can also create multiple actions in one zap, for example sending an email and a Slack notification and creating a new Intercom customer record at the same time.

For this example, we will select the Slack service when asked in the “action” section.

First, we select the type of “action” we want to perform: “Send Channel Message”. Other actions, such as sending a direct message, are also available.

Then, similar to the initial Google Sheets steps, we select the Slack account we want to use.

Finally, among the many options, we set the sender name, avatar, and other details, and most importantly the channel the message should go to and the message content itself.

Zapier makes it pretty intuitive and simple to construct smart content areas like the message field. You can both type static text and insert actual data (variables) from your source. In this example, our source is the Google Sheets document, so you will see a searchable dropdown for finding the column value you want to insert when constructing a message with dynamic parts.

Once everything is done, you can finish this step, and Zapier will require you to test the action you just set up. And all done! Don’t forget to turn the zap “on”.

This is just the simplest example I can use. There are many use cases where this integration lets you push changes and data into the thousands of services available in Zapier.

Happy Zaps!

Web, UI and browser automation with headless browsers

I wanted to give you a short introduction to browser automation. You picture a desktop app when the word “browser” comes to mind, right? All browsers use an engine to render web pages on our screens, and these engines can actually work without drawing pages in a UI. All they need is to render the elements in memory. From there, they let us interact with rendered pages programmatically without displaying them on our screens. Some browsers work only in this mode, and they are called “headless” browsers, meaning they have no UI. These browsers are meaningless for general consumers, but they come in very handy for the developer and testing communities. Many companies build their testing and QA processes on headless browsers and execute their UI flows with browser automation scripts. For instance, a headless browser can be programmed to run and simulate the following user flow:

  • Load the http://example.com web page,
  • Wait until the page is completely rendered, including JavaScript and CSS,
  • Type “Fatih” into the field called “Name”,
  • Click the button called “Send”,
  • Wait 5 seconds,
  • Take a screenshot and save it as a JPEG.

This can be very useful when doing regression tests.
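
The flow listed above could look roughly like this with Playwright, which launches browsers headless by default. Treat it as a sketch under assumptions: example.com has no such form, so the “Name” label, the “Send” button, the function name, and the output file name are placeholders for your own page.

```javascript
// Sketch only: the "Name" field, "Send" button, and file name are placeholders.
async function runContactFlow(page) {
  // Load the page and wait for the load event (scripts and styles included).
  await page.goto('http://example.com', { waitUntil: 'load' });
  // Type "Fatih" into the field labelled "Name".
  await page.getByLabel('Name').fill('Fatih');
  // Click the button named "Send".
  await page.getByRole('button', { name: 'Send' }).click();
  // Fixed 5-second pause, as in the list; waiting for a specific element
  // to appear is usually more robust than a fixed delay.
  await page.waitForTimeout(5000);
  // Save the result as a JPEG.
  await page.screenshot({ path: 'after-send.jpg', type: 'jpeg' });
}

// Usage (needs the playwright package and an installed Chromium):
//   const { chromium } = require('playwright');
//   const browser = await chromium.launch();
//   const page = await browser.newPage();
//   await runContactFlow(page);
//   await browser.close();
```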

Even just taking screenshots with headless browsers is very useful. Many companies compare screenshots to get a high-level view of the visual changes introduced by a given code iteration. The process keeps a versioned screenshot of each page; on every release, it takes a new one of the latest version and compares its pixels (and colors) to the previous version to compute the percentage of change it detects. You can then set up reporting and a process to track big changes and detect whether a tiny CSS change blew up a page you don’t usually test manually. This becomes very meaningful when you think about a website with 100 different pages.
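
The core of such a comparison can be sketched in a few lines. This is a simplified illustration, not how any particular tool does it: it assumes both screenshots have already been decoded into raw RGBA pixel arrays of the same dimensions (a PNG or JPEG decoder is needed for that step), and the function name and tolerance parameter are made up for the example.

```javascript
// Compare two decoded RGBA buffers of identical dimensions and return the
// percentage of pixels that differ by more than `tolerance` in any channel.
function diffPercentage(before, after, tolerance = 0) {
  if (before.length !== after.length || before.length % 4 !== 0) {
    throw new Error('Buffers must be RGBA data of the same size');
  }
  const totalPixels = before.length / 4;
  if (totalPixels === 0) return 0;

  let changed = 0;
  for (let i = 0; i < before.length; i += 4) {
    // Check the R, G, B, and A channels of this pixel.
    for (let c = 0; c < 4; c++) {
      if (Math.abs(before[i + c] - after[i + c]) > tolerance) {
        changed++;
        break; // count each pixel at most once
      }
    }
  }
  return (changed / totalPixels) * 100;
}

// Usage: flag a release when more than, say, 2% of a page changed.
//   if (diffPercentage(oldPixels, newPixels, 5) > 2) { /* report it */ }
```

A small tolerance helps ignore anti-aliasing noise, and the threshold you alert on is a judgment call per page.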

Are there any headless browsers I can use?

The well-known headless browsers include PhantomJS (“Phantom”), which is big in the Node.js community, and headless Chrome, which is based on Chromium.

There is an extensive list of headless browsers here: https://github.com/dhamaniasad/HeadlessBrowsers

Happy browser automations 🙂