Unbundle that giant OpenAPI file!
Last updated: Mar 22, 2024
Besides the increased writer comfort, this makes it easy to compose multiple API definitions from the same underlying schema.
Awesome! 10,000 lines of YAML boilerplate
Have you ever said “I love reading 10k-line YAML files?” If you have, I suspect you’re a robot.
Because if you’re a human, there’s a good chance you don’t love reading 10k-line YAML files. And if you’ve ever worked on a large OpenAPI spec, there’s also a good chance you’ve gotten lost, at least once, in thousands of lines of YAML spaghetti.
Fortunately, it doesn’t need to be that way. Because many parsers support multi-file references, you can unbundle the YAML file into individual pieces.
Even if the parser you use looks in only one file, it’s simple to take a multi-file definition and bundle it into a single megafile, sure to please a hungry machine.
Unbundling solves multiple documentation issues
In a system with large or multiple REST APIs, there are some common documentation problems that come up:
- The organization might need separate public and private versions.
- The organization might want to present multiple APIs that have a large overlap in their underlying schema.
- If the definition is hand-written, the writer might be sad about working with a YAML mega-file.
- Maybe, if the writer uses the same hardware that I use, a mega-YAML file might even cause their text editor to lag and crash.
Fortunately, all these problems have one solution:
- Split the definition up into smaller files
Redoc to the rescue
Redoc is pretty well-known for their API docs.
Besides their redoc-cli tool, they also make openapi, a command-line utility that bundles, unbundles, previews, and lints docs.
This demo uses openapi. The topic focuses on the bundling and unbundling, but the other features are nice too.
Tutorial: Unbundle a large YAML file
In this demo, I’m going to:
- Inspect an openapi.yaml file.
- Use the openapi CLI tool to unbundle it into small pieces.
- Separate the API into two top-level definitions.
- Rebundle the API as two distinct documents.
This demonstration should be reproducible.
If you can use the command line, feel free to follow along.
You’ll need to use npm to install the openapi tool (linked in the preceding section).
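Before starting, it can help to check whether the tool is already on your PATH. This is only a sketch: have_cmd is a helper I made up, and the npm package name in the hint is my assumption about where the openapi command comes from, so check the tool's own install instructions.

```shell
# have_cmd is a hypothetical helper: it succeeds if a command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd openapi; then
  echo "openapi CLI found"
else
  # Assumption: the openapi command ships in this npm package.
  echo "openapi CLI not found; try: npm install -g @redocly/openapi-cli"
fi
```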
Find a large file
First we need to find a heavy-duty API spec. The OpenAPI directory is a good place to look.
I’m going to choose an API from NASA. At 373 lines, it’s a flyweight in the world of OpenAPI definitions. But it’s enough for a demonstration.
- Download your API file. You can do it with curl like this:
curl -o openapi.yaml https://raw.githubusercontent.com/APIs-guru/openapi-directory/main/APIs/nasa.gov/asteroids%20neows/3.4.0/openapi.yaml
- Inspect its contents. This API is quite slim. It has just a few endpoints:
/api. Calling this endpoint returns the OpenAPI specification (recursive!).
/api/projects. This lists all projects.
/api/projects/{id}. This returns information about a specific project.
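If you’d rather not scroll through the file, here is a rough way to list the endpoints from the command line. It’s a sketch, not a YAML parser: list_paths is a helper I made up, and it assumes the path keys sit at exactly two spaces of indentation and start with a slash.

```shell
# list_paths is a hypothetical helper: print keys that are indented by
# exactly two spaces and start with "/" (optionally wrapped in quotes).
list_paths() {
  sed -n -E "s#^  ['\"]?(/[^'\":]*)['\"]?:.*#\1#p" "$1"
}

# Usage: list_paths openapi.yaml
```

Because it works on plain text, it should behave the same on the original file and on a top-level definition produced by a split, as long as the indentation assumption holds.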
Let’s pretend this mega API file has two audiences:
- Tech writers, who are interested in only OpenAPI YAML files.
- Project managers, who are interested in only projects.
For this demonstration, I’m going to create two different API definitions. This way we can have a special dedicated API for the tech writers, and another one for the project managers.
Unbundle the file
- Make sure your file is in a clean directory.
.
└── openapi.yaml
0 directories, 1 file
- Create a directory for the output. I call mine unbundled.
mkdir unbundled
.
├── openapi.yaml
└── unbundled
1 directory, 1 file
- Run the openapi split command.
openapi split openapi.yaml --outDir unbundled/
Document: openapi.yaml is successfully split
and all related files are saved to the directory: unbundled/
openapi.yaml: split processed in 134ms
This tool is fast, even on very big files.
- Explore the new contents of the unbundled directory. There are many more files now, one for each path, and one for each reusable schema.
.
├── openapi.yaml                  # The original file
└── unbundled                     # The unbundled directory
    ├── components                # The reusable schemas
    │   └── schemas
    │       ├── closeoutDocument.yaml
    │       ├── coInvestigator.yaml
    │       ├── destination.yaml
    │       ├── file.yaml
    │       ├── libraryItem.yaml
    │       ├── organization.yaml
    │       ├── principalInvestigator.yaml
    │       ├── programDirector.yaml
    │       ├── programManager.yaml
    │       ├── projectManager.yaml
    │       ├── project.yaml
    │       ├── technologyArea.yaml
    │       └── workLocation.yaml
    ├── openapi.yaml              # The new, top-level definition
    └── paths                     # The paths
        ├── api@projects{.format}.yaml
        ├── api@projects@{id}{.format}.yaml
        └── api.yaml
4 directories, 18 files
Split the definition into two APIs
We’re going to make two APIs, one for tech writers, and one for project managers.
- Inspect the unbundled/openapi.yaml file. It now has only 50 lines. That’s because the schemas and endpoints are tucked away in their own files. The paths property now looks like this:
paths:
  /api:
    $ref: paths/api.yaml
  '/api/projects/{id}{.format}':
    $ref: 'paths/api@projects@{id}{.format}.yaml'
  '/api/projects{.format}':
    $ref: 'paths/api@projects{.format}.yaml'
If you’ve used the $ref keyword before, this should look familiar. It’s the same principle, but instead of referencing a position in the same file, you’re now referencing a separate file in a directory.
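Since each $ref now points at a file, a typo in a file name turns into a broken reference. Here is a small sketch that reports $ref targets that don’t exist on disk. missing_refs is a helper I made up, it only does plain-text matching, and it assumes each $ref sits on its own line with a path relative to the file that contains it.

```shell
# missing_refs is a hypothetical helper: print "$ref" targets in a file
# that do not exist relative to that file's directory.
missing_refs() {
  file=$1
  dir=$(dirname "$file")
  grep -E '^[[:space:]]*(-[[:space:]]+)?\$ref:' "$file" |
    sed -E 's/^[[:space:]]*(-[[:space:]]+)?\$ref:[[:space:]]*//' |
    tr -d "'\"" |
    while IFS= read -r ref; do
      target=${ref%%#*}   # drop any "#/..." pointer after the file name
      if [ -n "$target" ] && [ ! -f "$dir/$target" ]; then
        echo "missing: $target"
      fi
    done
}

# Usage: missing_refs unbundled/openapi.yaml
```

References that point inside the same file (starting with #) are skipped, because there is no file to look for.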
- To make two API definitions, copy the top-level definition to another file. Rename both files, if you want.
cp unbundled/openapi.yaml unbundled/writersAPI.yaml
mv unbundled/openapi.yaml unbundled/projectsAPI.yaml
- Open writersAPI.yaml. Delete the two /projects paths, and their references.
- Open projectsAPI.yaml. Delete the /api path, and its reference.
That’s it! Now you have two separate API definitions that share the same underlying schemas. Furthermore, your specification is slim and DRY. You won’t have to wrangle any YAML monsters, and if different paths use the same schema, you can update multiple definitions by updating only that one schema.
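Deleting paths by hand is fine for three endpoints. If you have many more, the same edit could be scripted. The following is a sketch under assumptions, not part of the openapi tool: drop_paths_matching is a helper I made up, and it relies on the top-level file using two-space indentation for path keys, as in the snippet shown earlier.

```shell
# drop_paths_matching is a hypothetical helper: read a top-level
# definition on stdin and drop every two-space-indented entry whose key
# matches the given extended regex, along with the lines nested under it.
drop_paths_matching() {
  awk -v pat="$1" -v q="'" '
    /^  [^ ]/ {                       # a key at two-space indentation
      key = $0
      sub(/^  /, "", key)             # strip the indentation
      sub(/:[ \t]*$/, "", key)        # strip the trailing colon
      gsub("[\"" q "]", "", key)      # strip surrounding quotes
      skip = (key ~ pat)
    }
    /^[^ ]/ { skip = 0 }              # a top-level key ends any skipped entry
    !skip
  '
}

# Usage, starting again from the split output:
#   drop_paths_matching '^/api/projects' < unbundled/openapi.yaml > unbundled/writersAPI.yaml
#   drop_paths_matching '^/api$' < unbundled/openapi.yaml > unbundled/projectsAPI.yaml
```

It treats the file as text, so it’s worth opening the results to confirm that only the intended paths were removed.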
Rebundle the file
Some parsers can handle multi-file definitions. Redoc is an example (I guess that’s not a surprise).
Others can read only one file. If that’s the case for you, you can still work from the unbundled files. You’ll just need to bundle them when you’re done.
- To keep the directory neat, make a directory for bundles.
mkdir bundled
- Use the openapi bundle command to create new, single-file definitions from your multi-file definitions.
openapi bundle unbundled/writersAPI.yaml -o bundled/bundledWritersAPI.yaml
openapi bundle unbundled/projectsAPI.yaml -o bundled/bundledProjectsAPI.yaml
Now you have all kinds of API definitions. They don’t all contain the same information, but they are all made from the same source files.
.
├── bundled
│   ├── bundledProjectsAPI.yaml
│   └── bundledWritersAPI.yaml
├── openapi.yaml
└── unbundled
    ├── components
    │   └── schemas
    │       ├── closeoutDocument.yaml
    │       ├── coInvestigator.yaml
    │       ├── destination.yaml
    │       ├── file.yaml
    │       ├── libraryItem.yaml
    │       ├── organization.yaml
    │       ├── principalInvestigator.yaml
    │       ├── programDirector.yaml
    │       ├── programManager.yaml
    │       ├── projectManager.yaml
    │       ├── project.yaml
    │       ├── technologyArea.yaml
    │       └── workLocation.yaml
    ├── paths
    │   ├── api@projects{.format}.yaml
    │   ├── api@projects@{id}{.format}.yaml
    │   └── api.yaml
    ├── projectsAPI.yaml
    └── writersAPI.yaml
5 directories, 21 files
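With more than a couple of top-level definitions, the bundle commands above could be wrapped in a loop. This is a sketch under assumptions: bundle_all and the OPENAPI_CMD override are names I made up, and it assumes every top-level definition in the source directory ends in API.yaml.

```shell
# bundle_all is a hypothetical helper: bundle every *API.yaml file in a
# source directory into an output directory. OPENAPI_CMD (also made up)
# lets you point at a different binary, or at "echo" for a dry run.
bundle_all() {
  src=$1
  out=$2
  mkdir -p "$out"
  for def in "$src"/*API.yaml; do
    [ -f "$def" ] || continue   # nothing matched: skip the literal glob
    name=$(basename "$def" .yaml)
    # Capitalize the first letter: writersAPI -> WritersAPI
    first=$(printf '%s' "$name" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$name" | cut -c2-)
    "${OPENAPI_CMD:-openapi}" bundle "$def" -o "$out/bundled${first}${rest}.yaml"
  done
}

# Usage: bundle_all unbundled bundled
```

Setting OPENAPI_CMD=echo first prints the commands instead of running them, which is a cheap way to check the output names before bundling for real.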
Where to go from here?
I’ve written a few OpenAPI definitions by hand. At the beginning, it usually is easier to just write in one file. At some point, though, that file is going to get unwieldy. At that time, think about unbundling.
Besides writer comfort, unbundled definitions are extremely handy for creating multiple API definitions from a single source. For example, instead of “Writers” and “Projects,” you might have “Internal” and “External”.
You can also use CI to automate the bundling process. I had one client with public and private API documents. Their public docs were built using a tool that didn’t support multi-file definitions.
Using GitHub Actions, I made a “script” that:
- Bundled the file in the internal repo
- Sent the bundled file to a public repo, where it was turned into documentation.
More than a year later, I see the public docs are still getting automatically updated. Nice!