This is the site for the beta version of Stork Search.

Build a Search Index

To start using Stork, you need to build a search index: a file that Stork can load to respond to search queries. To build one, you give Stork a configuration file that lists the documents you want indexed, along with some metadata about how to index them.

The Configuration File

A Stork configuration file describes all the documents that should be indexed. This configuration file can be written in either JSON or TOML.

Stork can read a document's contents from the web, from the filesystem, or inline from the configuration file itself.

basic.toml
[[input.files]]
title = "1: General Introduction"
contents = "After an unequivocal experience of the inefficiency of the subsisting federal government, you are called upon to deliberate on a new Constitution for the United States of America..."
url = "https://federalist.stork-search.net/1.html"

[[input.files]]
title = "2-5: Concerning Dangers from Foreign Force and Influence"
contents = "When the people of America reflect that they are now called upon to decide a question, which, in its consequences, must prove one of the most important that ever engaged their attention..."
url = "https://federalist.stork-search.net/2-5.html"

[[input.files]]
title = "6-7: Concerning Dangers from Dissensions Between the States"
contents = "The three last numbers of this paper have been dedicated to an enumeration of the dangers to which we should be exposed, in a state of disunion, from the arms and arts of foreign nations..."
url = "https://federalist.stork-search.net/6-7.html"

To build a search index from this configuration file, save the file to disk and pass it into the Stork command-line tool:

$ stork build --input basic.toml --output federalist.st
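
The example above keeps every document's text inline. As a minimal sketch of mixing sources, the configuration below combines one inline entry with one entry read from the filesystem. The `contents` and `path` keys both appear elsewhere on this page; the file name `federalist-2-5.txt` is hypothetical, and how relative paths are resolved is an assumption to check against the Configuration File Reference. The key for web sources is not shown on this page, so it is left out here.

```toml
# Entry whose text is written inline in the configuration file.
[[input.files]]
title = "1: General Introduction"
contents = "After an unequivocal experience of the inefficiency of the subsisting federal government, you are called upon to deliberate on a new Constitution for the United States of America..."
url = "https://federalist.stork-search.net/1.html"

# Entry whose text is read from a file on disk.
# "federalist-2-5.txt" is a hypothetical file name used for illustration.
[[input.files]]
title = "2-5: Concerning Dangers from Foreign Force and Influence"
path = "federalist-2-5.txt"
url = "https://federalist.stork-search.net/2-5.html"
```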

Testing your index

After writing a config file, you might want to test how well the search interface works before loading it onto your website. Stork offers a test mode, where it will build your search index and load it into a simplified web interface so you can try out the search functionality while iterating on your configuration.

To test out your index, run:

$ stork test --index federalist.st

and open http://localhost:1612, the web page served by Stork.

File Formats

Today, Stork can automatically recognize and extract text from four types of files:

  1. Plain text files,
  2. SRT subtitle files,
  3. HTML files, and
  4. Markdown files.

Stork detects each file's format by inspecting its file extension. If a file's extension is non-standard (such as .mdx for a Markdown file), you can specify that file's format explicitly in the configuration:

[[input.files]]
path = "federalist-1.mdx"
title = "1: General Introduction"
url = "https://federalist.stork-search.net/1.html"
filetype = "Markdown"
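
The same override should apply to the other supported formats. The sketch below assumes the `filetype` values `"PlainText"` and `"SRTSubtitle"`, which are the spellings used in the Stork 1.x configuration reference and may differ in the beta. Both file names are hypothetical.

```toml
# A plain text file with an unusual extension.
# The value "PlainText" is an assumption taken from the Stork 1.x reference.
[[input.files]]
path = "federalist-1.essay"
title = "1: General Introduction"
url = "https://federalist.stork-search.net/1.html"
filetype = "PlainText"

# A subtitle file with an unusual extension.
# The value "SRTSubtitle" is likewise an assumption.
[[input.files]]
path = "federalist-1-reading.captions"
title = "1: General Introduction (audio reading)"
url = "https://federalist.stork-search.net/1.html"
filetype = "SRTSubtitle"
```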

Additional Options

You can visit the Configuration File Reference to see the full list of acceptable configuration key-value pairs.

The Stork configuration file lets you control many aspects of how Stork indexes your content and how the search interface behaves, such as:

  • How frontmatter in a document should be parsed
  • Which HTML selectors in an HTML document should be indexed and which should be ignored
  • Which language the stemmer uses to stem each word in your corpus
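
As a rough sketch of how these three options might look in practice, the configuration below uses the key names and values from the Stork 1.x configuration reference (`frontmatter_handling`, `html_selector`, `exclude_html_selector`, `stemming`). Treat every key and value in the `[input]` table as an assumption: the beta may rename or relocate them, so confirm against the Configuration File Reference before relying on them. The selectors and the file name are hypothetical.

```toml
[input]
# How frontmatter is treated (assumed key and value).
frontmatter_handling = "Parse"
# Only index text inside elements matching this selector (assumed key).
html_selector = "article"
# Skip text inside elements matching this selector (assumed key).
exclude_html_selector = ".sidenote"
# Language used by the stemmer (assumed key and value).
stemming = "English"

[[input.files]]
path = "federalist-1.html"
title = "1: General Introduction"
url = "https://federalist.stork-search.net/1.html"
```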

© 2019–2023.

Stork is maintained by James Little, who's really excited that you're checking it out.

This site is open source. Please file a bug if you see something confusing or incorrect.

Logo art by Bruno Monts, with special thanks to the fission.codes team.
Please contact James Little before using the logo for anything.