As I build this website, I am quickly realizing how important it is to implement a quality graphics processing pipeline early. Up to this point, I have implemented minor automation using Makefiles and Bash scripts; however, as the complexity of the site grows, so does the need for a better tool. In this article, I present my thoughts on an improved processing pipeline for rendering graphics for this website.
Audience
Knowing who or what will be viewing graphics is the first step to developing the right graphics processing pipeline. As I continue to build this site and learn, it seems like consumers of graphics keep emerging from the ether.
First, there are humans. This may seem obvious - and it is - but there are a number of nuances to consider. For example, it is 2023 and I can expect the majority of users to arrive on mobile devices, which means raster image sizes and formats need to be appropriate for mobile screens and bandwidth. Humans may also consume some images in a photo gallery-style layout, so it is best to provide a smaller thumbnail for browsing and a larger image for focused viewing. This use case impacts the design of the graphics processing pipeline as well.
Additionally, I must consider how automation can complement my accessibility goals. My mom is blind, and I love the idea of her being able to have a great experience when reading about what her son is up to. While accessibility may seem like a non-issue when building a graphics processing pipeline, the naming convention can provide hints for the content of the rendered alt and title attributes.
Another audience to consider is services that support the Open Graph protocol such as Facebook, Twitter, and LinkedIn. To align my automation goal with Open Graph, I need to be able to generate compatible graphics with minimal input.
Finally, I must also consider other consumers such as Google when rendering structured data for the site. For the graphics processing pipeline, this will involve selecting the correct formats and rendering optimal graphics sizes.
Graphics Formats
To keep things as simple as possible, I want to support as few graphics file formats as possible. To this end, I settled on SVG for vector graphics and webp for raster graphics. The SVG format is a long-time standard and is a supported export format from OmniGraffle. Generally, I prefer vector-based graphics for the site since I can scale and recolor them as needed by the theme.
Webp is an open format developed by Google and released under a BSD license. Overall, the webp format is quite versatile, supporting lossy and lossless compression, transparency, and animation. Moreover, its output size typically beats other popular formats, which means faster download times and a better overall experience. Browser support for webp is more than sufficient given that this site is designed to use modern tools and be viewed on modern devices.
Naming Convention
The naming convention I plan to use in the updated graphics rendering pipeline needs to be simple, expressive, and scalable to new features. For this reason, I am using a double underscore __ to separate fields in the filename. Future fields can be appended to the end, and any parsers I write can assume that missing fields to the right take a default value.
For webp raster graphics, I am using the following naming convention:
name[_name...]__[type]__[full].webp
Since SVG graphics are not associated with a specific size, I am using the following variation of the previous format:
name[_name...]__type.svg
Where:
- name is typically a noun that describes what the graphic represents. If more than one name is present, they must be ordered from most general to most specific.
- type contains flags about the image; for example, p means the image is a photograph.
- full is a suffix at the end that indicates the raster image is full size.
Some example graphics file names include:
- tod_logo_60px.webp: 60px wide site logo
- tod_logo_120px.webp: 120px wide site logo
- family__p.webp: Small-sized photograph of my family
- family__p__full.webp: Full-sized photograph of my family
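To make the convention concrete, here is a minimal sketch of how a parser might split these filenames into fields. The function name, return structure, and default values are my own illustration rather than part of an existing tool:

from pathlib import Path

def parse_graphic_name(filename: str) -> dict:
    """Split a graphics filename into the fields of the naming convention.

    Missing fields to the right fall back to defaults, so existing names
    keep working as new fields are appended.
    """
    path = Path(filename)
    fields = path.stem.split("__")

    names = fields[0].split("_")                        # e.g. ["family"] or ["tod", "logo"]
    type_flags = fields[1] if len(fields) > 1 else ""   # e.g. "p" for photograph
    full = len(fields) > 2 and fields[2] == "full"      # full-size raster marker

    return {
        "names": names,       # ordered from most general to most specific
        "type": type_flags,
        "full": full,
        "format": path.suffix.lstrip("."),
    }

# Example: {'names': ['family'], 'type': 'p', 'full': True, 'format': 'webp'}
print(parse_graphic_name("family__p__full.webp"))

The names field is also where accessibility hints come from: the ordered nouns can be joined into a sensible default for the rendered alt and title attributes.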
Graphics Processing Pipeline
The actual implementation of the graphics processing pipeline will vary slightly from article to article; however, most of my graphics processing pipelines will have similar steps:
- Sanitize filenames: Remove any whitespace and convert separator characters. This is to ensure compatibility with the graphics file naming convention.
- Resize Small: For raster graphics, create a small version of the image. In my pipeline, this will preserve the aspect ratio while resizing up to a maximum width. Depending on the article this could be adjusting a full-size photograph down to a more optimal size or it could be resized down to a thumbnail to be displayed in a lightbox.
- Resize Large: For raster graphics, this step will still resize the image while preserving the aspect ratio with a maximum width larger than the Resize Small step. Ideally, I would never have to resize a raster image up to a larger size since it will negatively affect the quality.
- Custom: After the graphic filename has been sanitized I have an optional hook to execute my own custom rules.
- Convert to webp: For raster graphics, this step will convert to the webp format while removing all image metadata. The image metadata removal step is important since I don’t want to inadvertently leak information about where I am over the public Internet.
- Remove SVG Metadata: Similar to the previously mentioned step, vector graphics can also contain metadata that I do not want to inadvertently leak onto the public Internet. Implementation of this step is however quite different and involves parsing the svg XML.
- Generate Open Graph Banner: For the given article or page, this step generates an Open Graph image that will represent the content when shared on social media.
- Cleanup: Each step in the pipeline will output a post-process copy of the graphic to help with debugging. Once the pipeline executes successfully this step will simply remove temporary files.
Visually, the pipeline steps will look something like this:
flowchart TB
    RENAME(["fas:fa-cog Sanitize Filenames"])
    RESIZE_SMALL(["fas:fa-cog Resize to Small"])
    RESIZE_LARGE(["fas:fa-cog Resize to Large"])
    CUSTOM1(["fas:fa-cog Custom"])
    CONVERT_WEBP(["fas:fa-cog Convert to webp,\n remove metadata"])
    REMOVE_SVG_METADATA(["fas:fa-cog Remove SVG Metadata"])
    OG_BANNER(["fas:fa-cog Generate Open Graph Banner"])
    CLEANUP(["fas:fa-cog Cleanup temporary files"])
    IF1{" "}
    RENAME --Varies--> IF1
    IF1 --Raster--> RESIZE_SMALL
    IF1 --Raster--> RESIZE_LARGE
    IF1 --Varies--> CUSTOM1
    IF1 --Varies--> OG_BANNER
    IF2{" "}
    CUSTOM1 --Raster--> IF2
    RESIZE_SMALL --Raster--> IF2
    RESIZE_LARGE --Raster--> IF2
    IF2 --Raster--> CONVERT_WEBP
    IF3{" "}
    IF1 --Vector--> IF3
    CUSTOM1 --Vector--> IF3
    IF3 --Vector--> REMOVE_SVG_METADATA
    IF4{" "}
    OG_BANNER --Raster--> IF4
    CONVERT_WEBP --Raster--> IF4
    REMOVE_SVG_METADATA --Vector--> IF4
    IF4 --Varies--> CLEANUP
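As a rough illustration of the resize, webp conversion, and SVG metadata removal steps, here is a sketch using wand (the ImageMagick binding listed as a dependency below) and the standard library XML parser. The function names, widths, and file paths are placeholders of my own, not the finished implementation:

from wand.image import Image            # Python binding for ImageMagick
import xml.etree.ElementTree as ET


def resize_and_convert(src_path: str, dest_path: str, max_width: int) -> None:
    """Resize a raster graphic to at most max_width pixels wide, preserving
    the aspect ratio, strip its metadata, and write it out as webp."""
    with Image(filename=src_path) as img:
        # The trailing '>' tells ImageMagick to only shrink, never enlarge.
        img.transform(resize=f"{max_width}>")
        img.strip()                      # drop EXIF/GPS and other embedded metadata
        img.format = "webp"
        img.save(filename=dest_path)


def remove_svg_metadata(src_path: str, dest_path: str) -> None:
    """Remove <metadata> elements from an SVG so exported drawings do not
    leak editor or location information."""
    ET.register_namespace("", "http://www.w3.org/2000/svg")
    tree = ET.parse(src_path)
    root = tree.getroot()
    for element in root.findall("{http://www.w3.org/2000/svg}metadata"):
        root.remove(element)
    tree.write(dest_path, encoding="utf-8", xml_declaration=True)


# Small and full-size variants of the same source photograph.
resize_and_convert("img_src/family.jpg", "tmp/family__p.webp", max_width=480)
resize_and_convert("img_src/family.jpg", "tmp/family__p__full.webp", max_width=1600)

The trailing > in the resize geometry means an image is only ever scaled down, which matches the goal of never resizing a raster image up to a larger size.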
As shown in the diagram below, the directory structure for each article or page source tree contains both source and rendered graphics. Once the final site is rendered only content in the img/
directory will be published:
flowchart TD
    ROOT("fas:fa-folder Article Root/")
    IMG("fas:fa-folder img/")
    IMG_SRC("fas:fa-folder img_src/")
    ROOT --- IMG_SRC
    subgraph Pre-Process
        SVG1("fas:fa-image *.svg")
        JPG("fas:fa-image *.jpg")
        PNG("fas:fa-image *.png")
        HEIC("fas:fa-image *.HEIC")
        IMG_SRC --- SVG1
        IMG_SRC --- JPG
        IMG_SRC --- PNG
        IMG_SRC --- HEIC
    end
    TMP("fas:fa-folder tmp/")
    ROOT --- TMP
    ROOT --- IMG
    subgraph Post-Process
        SVG2("fas:fa-image *.svg")
        WEBP("fas:fa-image *.webp")
        IMG --- SVG2
        IMG --- WEBP
    end
    INDEX("fas:fa-file index.html")
    ROOT --- INDEX
Git LFS
Broadly, Git LFS (Large File Storage) is a Git extension designed for managing large files such as images, videos, and other binary files. There are many good reasons for me to use Git LFS to store graphics files including:
- Git LFS stores only a pointer to the large file in the git repository, while the actual file is stored in a separate location. This saves disk space and makes it easier to manage large files in a git repository.
- Git LFS reduces the amount of data that needs to be downloaded and uploaded during git operations, which can improve performance and reduce the time it takes to clone or push/pull changes.
Taking advantage of my homelab, I am using Ceph S3 object storage as an LFS backend via the following Gitea configuration:
[lfs]
STORAGE_TYPE = ceph_s3
MINIO_BASE_PATH = lfs/
SERVE_DIRECT = false
[storage.ceph_s3]
MINIO_SECRET_ACCESS_KEY = SECRET
MINIO_USE_SSL = true
MINIO_ENDPOINT = s3.kube
MINIO_BUCKET = gitea
MINIO_LOCATION = MY_LOCATION
SERVE_DIRECT = true
MINIO_ACCESS_KEY_ID = SECRET
STORAGE_TYPE = minio
Finally, in my repository itself, I am using the following .gitattributes
file to instruct my local git client that graphics files are to use LFS:
*.heic filter=lfs diff=lfs merge=lfs -text
*.HEIC filter=lfs diff=lfs merge=lfs -text
*.webp filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
*.JPG filter=lfs diff=lfs merge=lfs -text
*.jpeg filter=lfs diff=lfs merge=lfs -text
*.JPEG filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
Solution Architecture
The graphics pipeline tool is a simple Python-based CLI with bindings to the OS-provided ImageMagick and webp packages. When executed, the CLI reads a config.py file in the article or page root folder, which defines the processing steps, and spawns a set of worker processes to execute the pipeline on graphics in the img_src/ directory. Graphics processing logic is implemented as an SDK, and steps are chained together via the specification in the config.py file:
flowchart LR
    subgraph Article Directory
        IMG_SRC("fas:fa-folder img_src/")
        TMP("fas:fa-folder tmp/")
        IMG("fas:fa-folder img/")
        CONFIG("fas:fa-file config.py")
    end
    subgraph App
        CLI("fas:fa-gear CLI")
        MODULES("<SDK>\nfas:fa-gear modules")
        WORKER("<Process>\nfas:fa-gear Worker 1..𝑛")
        CLI --Load--> CONFIG
        CLI --Spawn--> WORKER
        MODULES --Execute--- WORKER
        WORKER --Read--> IMG_SRC
        WORKER --Write--> TMP
        WORKER --Write--> IMG
    end
    OS_IMAGEMAGICK("<OS Package>\nfab:fa-ubuntu ImageMagick")
    OS_WEBP("<OS Package>\nfab:fa-ubuntu webp")
    PIP_CLICK("<PyPi Package>\nfab:fa-python click")
    PIP_WAND("<PyPi Package>\nfab:fa-python wand")
    PIP_PYWEBP("<PyPi Package>\nfab:fa-python pywebp")
    OS_IMAGEMAGICK --Binding--- PIP_WAND
    OS_WEBP --Binding--- PIP_PYWEBP
    PIP_CLICK --- App
    PIP_WAND --- App
    PIP_PYWEBP --- App
Dependencies include pywebp, a Python binding for webp, and wand, a Python binding for ImageMagick. Finally, to maximize usability, the CLI leverages the click CLI framework.
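To show how these pieces are intended to fit together, here is a rough sketch of the CLI entry point. The option names, the PIPELINE convention in config.py, and the use of threads instead of worker processes are assumptions made to keep the example short:

from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import runpy

import click


@click.command()
@click.option("--root", type=click.Path(exists=True), default=".",
              help="Article or page root containing img_src/ and config.py.")
@click.option("--workers", default=4, help="Number of parallel workers.")
def render(root: str, workers: int) -> None:
    """Run the graphics processing pipeline defined in config.py."""
    root_path = Path(root)

    # config.py is ordinary Python; executing it yields the pipeline definition.
    config = runpy.run_path(str(root_path / "config.py"))
    pipeline = config["PIPELINE"]  # assumed: an ordered list of step callables

    sources = sorted((root_path / "img_src").iterdir())

    def process(src: Path) -> Path:
        # Each step takes a path and returns the path it wrote for the next step.
        for step in pipeline:
            src = step(src)
        return src

    # The real tool would spawn worker processes; threads keep this sketch short.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(process, sources):
            click.echo(f"rendered {result}")


if __name__ == "__main__":
    render()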
The decision to use a Python-based config versus JSON or YAML warrants a short explanation. In my opinion, when configuration requires chaining together a dynamic list of actions, it is better to create a well-designed SDK and use code as configuration to implement the desired flow. If I decided to use a more conventional YAML-based configuration, I would inevitably need to write a parser that effectively turns the YAML file into a DSL. This creates problems for both users and maintainers. Users would need to learn the DSL syntax and would have limited tool support, whereas a pure Python configuration gets intelligent code completion and unit testing tools for free. For maintainers, a custom DSL requires constant adjustment as new features are implemented and introduces a significant source of error. Moreover, as the DSL evolves, maintainers need to be careful not to break existing configurations that previously worked, which are often difficult to test.
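To illustrate what code as configuration might look like in practice, here is a hypothetical config.py for an article. The inline sanitize step shows that a step is just a Python callable; the commented-out SDK steps are placeholders for modules that do not exist yet:

# config.py -- code as configuration: the pipeline is just an ordered Python list.
from pathlib import Path


def sanitize_filename(src: Path) -> Path:
    """Lowercase the name and replace whitespace so files match the __ convention."""
    clean = src.with_name(src.name.lower().replace(" ", "_"))
    return src.rename(clean) if clean != src else src


PIPELINE = [
    sanitize_filename,
    # Remaining steps would come from the SDK (hypothetical names), e.g.:
    # steps.resize(max_width=480),
    # steps.resize(max_width=1600, suffix="__full"),
    # steps.convert_to_webp(strip_metadata=True),
    # steps.remove_svg_metadata(),
    # steps.cleanup(),
]

Because the configuration is plain Python, each step can be unit tested in isolation and edited with full IDE support rather than against a bespoke DSL.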
Final Thoughts
In this article, I introduced a design for my future graphics processing pipeline and covered several important considerations, including audience, graphics formats, naming conventions, pipeline steps, storage with Git LFS, and solution architecture. Expect future articles on this topic as I implement the tool.