Multilingual Search for Jekyll Docs with Pagefind
Why Search Matters in Documentation
As your documentation grows, structured navigation alone becomes insufficient. Users often prefer searching directly for keywords rather than clicking through menus. This is especially true for developer documentation or multilingual platforms where finding localized information quickly is crucial.
GitHub Pages doesn’t allow server-side functionality, which rules out self-hosted search engines like Elasticsearch. Hosted services such as Algolia do work with static sites, but they depend on an external account and API keys. That’s where client-side solutions like Pagefind come in.
What is Pagefind
Pagefind is a static search library whose search runs entirely in the browser. A command-line tool indexes your built site and generates a lightweight bundle: a JavaScript client, a WebAssembly module, and compressed index files that are fetched on demand. It supports multilingual sites out of the box, and its metadata and filter attributes can carry extra context such as a documentation version.
Why Use Pagefind with Jekyll
- No need for external APIs or keys
- Works entirely with static hosting like GitHub Pages
- Supports localization and multiple content roots
- Easy integration with Liquid and collections
Installing Pagefind in a Jekyll Project
First, install Pagefind locally:
npm install -D pagefind
Then, modify your build pipeline to run Pagefind after Jekyll builds your site:
jekyll build
# --site replaces the older --source flag in Pagefind 1.0 and later
npx pagefind --site _site
Setting Up Output Directory
This creates a _site/pagefind/ folder containing, among other files:
- pagefind.js: search engine client
- WebAssembly files: the compiled search module
- index/: compressed search indexes
Adding the Search UI
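Load Pagefind’s prebuilt UI from your Jekyll layout (typically _layouts/default.html): link pagefind-ui.css and pagefind-ui.js from the generated pagefind/ folder, add an empty container element, and create a PagefindUI instance that targets it.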
This provides a functional search bar and result list that works instantly on all built pages.
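As a minimal sketch of the initialization step, assume the layout already loads pagefind-ui.css and pagefind-ui.js from the /pagefind/ folder and contains an empty `<div id="search">`. The `#search` id and the helper names are illustrative and not part of Pagefind; `PagefindUI` and its `element` and `showSubResults` options come from the prebuilt UI.

```javascript
// Options for Pagefind's prebuilt UI. `element` is a CSS selector for the
// container; "#search" is an assumed id, not something Pagefind requires.
function searchUiOptions(selector = "#search") {
  return { element: selector, showSubResults: true };
}

// Mount the UI. The constructor is passed in so the function can be
// exercised outside a browser; in a page it is the global PagefindUI.
function mountSearch(PagefindUIClass, selector) {
  return new PagefindUIClass(searchUiOptions(selector));
}

// In the browser, wait for the DOM so the container element exists.
if (typeof window !== "undefined") {
  window.addEventListener("DOMContentLoaded", () => {
    if (typeof PagefindUI !== "undefined") mountSearch(PagefindUI);
  });
}
```

On a project site served under a baseurl, the asset paths in the layout would likely need Jekyll’s relative_url filter so that /pagefind/ resolves correctly.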
Indexing Multilingual and Versioned Content
Pagefind supports multiple content roots. This is useful if you have folders like /en/, /es/, or versioned paths like /v1/, /v2/.
Configure Multiple Roots
# --output-path is resolved from the working directory (Pagefind 1.0 and later)
npx pagefind --site _site/en --output-path _site/pagefind/en
npx pagefind --site _site/es --output-path _site/pagefind/es
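Then, dynamically load the correct index bundle in your layout based on the page language, for example by building the bundle path from page.lang in Liquid or from the lang attribute of the <html> element in client-side JavaScript.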
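One way to pick the bundle on the client is to derive its path from the page language. The sketch below assumes one bundle per language under /pagefind/en/ and /pagefind/es/ and that each page sets `<html lang="...">`. The `SUPPORTED_LANGS` list, the fallback language, and the function names are illustrative.

```javascript
// Languages that have their own bundle folder (assumed: /pagefind/<lang>/).
const SUPPORTED_LANGS = ["en", "es"];

// Map a language tag such as "es" or "es-MX" to a bundle URL,
// falling back to a default language when no bundle exists for it.
function bundleUrlFor(lang, fallback = "en") {
  const code = String(lang || "").toLowerCase().split("-")[0];
  const chosen = SUPPORTED_LANGS.includes(code) ? code : fallback;
  return `/pagefind/${chosen}/pagefind.js`;
}

// Load the Pagefind client for the given language via a dynamic import.
async function loadPagefind(lang) {
  return import(bundleUrlFor(lang));
}

// Browser usage (not executed here):
//   const pagefind = await loadPagefind(document.documentElement.lang);
//   const search = await pagefind.search("install");
```

Keeping the path logic in a small pure function makes it easy to check the fallback behavior without a browser.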
Customizing What Gets Indexed
By default, Pagefind indexes the entire body of each HTML page. You can fine-tune this with data-pagefind-ignore and data-pagefind-body attributes.
Exclude Navigation or Footers
<nav data-pagefind-ignore>...</nav>
<footer data-pagefind-ignore>...</footer>
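Limit Indexing to Specific Sections
Note that data-pagefind-body restricts indexing rather than boosting it: once the attribute appears anywhere on the site, only content inside elements that carry it is indexed, and pages without it are dropped from the index.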
<div data-pagefind-body>
<h2>Introduction</h2>
<p>This section will be indexed...</p>
</div>
Handling Language Labels in Results
Pagefind supports custom metadata fields. You can include language or version info inside the page front matter or HTML meta tags:
---
title: "Install Guide"
lang: en
version: v2
layout: doc
---
Then expose these values to Pagefind in your HTML layout with its data-pagefind-meta attribute, which can read the content of a <meta> tag:
<meta data-pagefind-meta="lang[content]" content="{{ page.lang }}" />
<meta data-pagefind-meta="version[content]" content="{{ page.version }}" />
These will be shown in search results and help users distinguish between similar entries in different contexts.
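If you render results yourself with Pagefind’s JavaScript API instead of the prebuilt UI, the metadata is available on each loaded result. The following is a sketch only: it assumes pages expose metadata keys named `lang` and `version`, and the helper names `resultLabel` and `labelledResults` are made up for illustration.

```javascript
// Build a display label such as "Install Guide (en, v2)" from the data
// object of a loaded Pagefind result. Missing metadata is skipped.
function resultLabel(data) {
  const meta = data.meta || {};
  const tags = [meta.lang, meta.version].filter(Boolean);
  const title = meta.title || data.url;
  return tags.length ? `${title} (${tags.join(", ")})` : title;
}

// Run a query and return labels for the first few results.
// `pagefind` is the module returned by importing pagefind.js.
async function labelledResults(pagefind, query, limit = 5) {
  const search = await pagefind.search(query);
  const loaded = await Promise.all(
    search.results.slice(0, limit).map((result) => result.data())
  );
  return loaded.map(resultLabel);
}
```

Each result’s content is loaded lazily through `data()`, so limiting the number of results you load keeps the number of fragment requests small.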
Case Study: Supporting Global Contributors in Open Source Docs
A multilingual open-source project received contributions in English, Portuguese, and Japanese. Previously, users had to manually browse language folders to find content.
Search Pain Points
- Users struggled to find translated tutorials
- Duplicate pages created confusion in search engines
- No unified navigation or search system
How Pagefind Helped
- Indexed all language folders separately with proper bundle paths
- Used Liquid logic to dynamically load the correct index bundle
- Embedded metadata for lang and version filtering
This reduced user bounce rate and improved page views per session across all language versions.
Performance Considerations
Because indexing runs during the build, the browser does no indexing work. At query time the client downloads only the compressed index chunks and result fragments relevant to the query.
- Initial load adds ~100KB
- Search queries return results within 50-100ms
- No need for backend maintenance or quotas
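If the initial download matters for your pages, one option is to defer loading the search client until the user first interacts with the search box. This sketch assumes a custom input with the hypothetical id `search-input` and a bundle at /pagefind/pagefind.js; the `once` helper is a generic utility, not a Pagefind function.

```javascript
// Wrap a loader so it runs at most once; later calls reuse the same result.
function once(loader) {
  let pending = null;
  return () => {
    if (pending === null) pending = loader();
    return pending;
  };
}

// The bundle path is an assumption; adjust it to your output folder.
const getPagefind = once(() => import("/pagefind/pagefind.js"));

// In the browser, start loading when the search input first gains focus.
if (typeof document !== "undefined") {
  const input = document.querySelector("#search-input");
  if (input) input.addEventListener("focus", getPagefind, { once: true });
}
```

Search handlers can then call `getPagefind()` and await the result; the import itself runs a single time no matter how many queries follow.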
Best Practices for Jekyll + Pagefind Integration
- Split search index by language/version for speed and clarity
- Use data-pagefind-ignore generously on layouts and navigation
- Show contextual metadata (version, language) in search results
- Keep your Jekyll permalink structure consistent across translations
Conclusion
Pagefind makes powerful search possible even on static Jekyll sites hosted on GitHub Pages. With proper configuration, you can support fast, language-aware search without external search services or bloated plugins. Combined with YAML-driven navigation and multilingual data handling, Pagefind completes the toolkit for building a scalable, user-friendly documentation system.
In the next article of this series, we’ll look at how to create reusable documentation blocks using includes, components, and parameterized content to reduce duplication and improve maintainability.
