This is the site for the beta version of Stork Search.

Stork Configuration File Reference

Just getting started? Learn how to build a search index

A Stork configuration file is a TOML file or JSON file that you pass into the Stork build command. This file defines the way your index is created and processed, and also controls some aspects of how your search results are displayed.

$ stork build --input my-config-file.toml --output my-index.st

The configuration file parser relies heavily on intuitive default values: if a field is inapplicable to your search index (or if you're happy with the listed default value), you can leave the field out of your configuration file.
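
As a rough illustration of leaning on defaults, the sketch below keeps only a base directory and a single file entry, reusing values from the examples on this page. It assumes that every omitted field has a usable default and that `base_directory` is where relative `path` values are resolved; neither is confirmed here, so treat it as a starting point rather than a guaranteed-minimal file.

```toml
# Minimal sketch: any field left out falls back to its default value.
[input]
# Assumption: relative `path` values below are resolved against this directory.
base_directory = "/my-files"

[[input.files]]
title = "1: General Introduction"                   # Required
url = "https://federalist.stork-search.net/1.html"  # Required
path = "general-introduction.txt"                   # One document source
```

A file like this would be passed to the same `stork build` command shown above.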

Input options

Input options define the list of documents that should be read and indexed, as well as the way those files are processed.

[input]
base_directory = "/my-files"
url_prefix = "https://example.com/"
title_boost = "Large"
stemming = "Spanish"
frontmatter_config = "Ignore"
# See below:
# html_config = ...
# srt_config = ...

[[input.files]]
# ...

HTML Configuration

The HTML configuration object defines how text should be extracted from HTML documents.

[input.html_config]
save_nearest_id = true
title_selector = "h1.page-title"
included_selectors = ['article']
excluded_selectors = ['pre', 'aside']

SRT Configuration

The SRT configuration object defines how URLs are generated from timestamp information embedded in SRT subtitle files.

[input.srt_config]
timestamp_template_string = "#t={}"
timestamp_format = "MinutesAndSeconds"

Files

Each document that is indexed needs to be defined in the configuration.

Documents can be read from the filesystem, from the web, or from within the configuration file itself.
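
The three document sources can be sketched as three separate file entries, each using a single source key. The `example.com` URLs, the second and third titles, and the inline contents are hypothetical placeholders, and whether `filetype` has to be set for inline contents is an assumption.

```toml
# 1. Read from the filesystem.
[[input.files]]
title = "1: General Introduction"
url = "https://federalist.stork-search.net/1.html"
path = "general-introduction.txt"

# 2. Read from the web. The URLs here are hypothetical.
[[input.files]]
title = "A page fetched from the web"
url = "https://example.com/pages/2.html"
src_url = "https://example.com/raw/2.html"

# 3. Read from within the configuration file itself.
[[input.files]]
title = "An inline document"
url = "https://example.com/pages/3.html"
contents = "A short piece of *Markdown* text to index."
filetype = "Markdown" # Assumption: may be needed because there is no file extension to inspect
```

Keeping one source per entry makes it unambiguous where each document's text comes from.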

Within the file object, you can override some of the global configuration objects that you defined in the input section.

# This syntax adds a new element to the `input.files` array.
# https://toml.io/en/v1.0.0#array-of-tables
[[input.files]]
title = "1: General Introduction" # Required
url = "https://federalist.stork-search.net/1.html" # Required

# One of the following 3 is required:
path = "general-introduction.txt"
src_url = "https://federalist.stork-search.net/1.html" # Can be omitted if it's the same as `url`
contents = "After an unequivocal experience of the inefficiency of the subsisting federal government, you are called upon to deliberate on a new Constitution for the United States of America..."

filetype = "Markdown"

# The below options all override the previously-specified input options.
stemming = "French"
html_config = {title_selector = "h1.custom-page-title", included_selectors = ['article', 'aside'], excluded_selectors = ['pre']}
srt_config = {timestamp_format = "NumberOfSeconds"}
frontmatter_config = "Omit"
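
The inline tables used for the per-file overrides can get long. Standard TOML also allows the same data to be written as sub-tables of the most recently declared `[[input.files]]` entry, so a sketch like the one below should parse to the same structure as the inline form. It relies only on TOML syntax, not on any Stork-specific behavior.

```toml
[[input.files]]
title = "1: General Introduction"
url = "https://federalist.stork-search.net/1.html"
path = "general-introduction.txt"
# Plain keys for this entry must appear before any sub-table header below.
stemming = "French"
frontmatter_config = "Omit"

# Sub-table of the file entry declared just above.
[input.files.html_config]
title_selector = "h1.custom-page-title"
included_selectors = ['article', 'aside']
excluded_selectors = ['pre']

# Another sub-table of the same file entry.
[input.files.srt_config]
timestamp_format = "NumberOfSeconds"
```

Any key written after a sub-table header belongs to that sub-table, which is why `stemming` and `frontmatter_config` come first.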

Output options

Output options define the behavior of the indexer.

[output]
minimum_query_length = 2
break_on_first_error = true

© 2019–2023.

Stork is maintained by James Little, who's really excited that you're checking it out.

This site is open source. Please file a bug if you see something confusing or incorrect.

Logo art by Bruno Monts, with special thanks to the fission.codes team.
Please contact James Little before using the logo for anything.