The search parameter finds results matching a given text search. Search requests cost $1 per 1,000 calls (vs. $0.10 per 1,000 for list+filter requests). See pricing.
# Works with "dna" in title, abstract, or fulltext
https://api.openalex.org/works?search=dna
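The request above can also be assembled in code. This is a minimal sketch using only the Python standard library; the helper name `search_url` is hypothetical and not part of any OpenAlex client.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org"


def search_url(entity: str, query: str) -> str:
    """Return a URL that runs a text search against one entity type."""
    # urlencode percent-encodes quotes and parentheses and turns spaces into "+".
    return f"{BASE_URL}/{entity}?{urlencode({'search': query})}"


print(search_url("works", "dna"))
```

Encoding the query this way keeps multi-word and quoted searches valid when they are sent by an HTTP client rather than pasted into a browser.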

What gets searched

Each entity type searches different fields:
Entity | Searchable fields
Works | title, abstract, fulltext
Authors | display_name, display_name_alternatives
Sources | display_name, alternate_titles, abbreviated_title
Institutions | display_name, display_name_alternatives, display_name_acronyms
Topics/Keywords | display_name, description

Text processing

OpenAlex uses stemming and removes stop words to improve results:
  • Words like “the” and “an” are removed
  • A search for “possums” also returns “possum”
  • Searches match whole words only (“lun” won’t match “lunar”)
Use search.exact to search without stemming:
https://api.openalex.org/works?search.exact=surgery
Only one search parameter is allowed per request: search, search.exact, or search.semantic.
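A client can catch a violation of the one-parameter rule before sending the request. The sketch below is one possible guard; `search_query_string` is a hypothetical helper, and what the API itself returns when the rule is broken is not shown here.

```python
from urllib.parse import urlencode

SEARCH_PARAMS = ("search", "search.exact", "search.semantic")


def search_query_string(params: dict) -> str:
    """Encode query parameters, refusing more than one search parameter."""
    used = [name for name in SEARCH_PARAMS if name in params]
    if len(used) > 1:
        raise ValueError(f"only one search parameter is allowed, got: {', '.join(used)}")
    return urlencode(params)


print(search_query_string({"search.exact": "surgery"}))
```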

Boolean operators

Use AND, OR, NOT (uppercase) for complex queries. Surround phrases with double quotes for exact matching:
# Works about "elmo" AND "sesame street" but NOT "cookie" or "monster"
https://api.openalex.org/works?search=(elmo AND "sesame street") NOT (cookie OR monster)
Words not separated by boolean operators are treated as AND.
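When a query is built from lists of terms, small helpers keep the parentheses and quotes consistent. This is an illustrative sketch; `term`, `all_of`, and `any_of` are hypothetical names, and the quoting rule (quote anything containing a space) is a simplifying assumption.

```python
from urllib.parse import urlencode


def term(text: str) -> str:
    """Quote multi-word phrases so they match exactly; leave single words bare."""
    return f'"{text}"' if " " in text else text


def all_of(*terms: str) -> str:
    """Join terms with AND inside one parenthesized group."""
    return "(" + " AND ".join(term(t) for t in terms) + ")"


def any_of(*terms: str) -> str:
    """Join terms with OR inside one parenthesized group."""
    return "(" + " OR ".join(term(t) for t in terms) + ")"


query = f'{all_of("elmo", "sesame street")} NOT {any_of("cookie", "monster")}'
url = "https://api.openalex.org/works?" + urlencode({"search": query})
print(query)
print(url)
```

The first line printed is the same query string as the example above; the second is its percent-encoded form.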

Large Boolean queries

The query travels in the request URL, and the whole request URL is limited to about 4 KB (roughly 4,000 characters). A very long search= value, typically a Boolean query with many OR terms of the kind common in systematic reviews, can exceed this limit and return a 400 error:
{
  "error": "Request URL too long",
  "message": "Your request URL is 4612 bytes, over the 4094-byte limit (roughly 4 KB, mostly the 'search' value). Split a large Boolean query into smaller chunks, request each separately, and combine the returned IDs client-side."
}
This is a fixed limit, not a usage/credit cost — splitting the query does not lose any results. To run a query larger than ~4 KB, split the OR list into chunks, request each chunk separately, and take the union of the returned IDs client-side. The result is exactly the same set of works:
# Instead of one query that is too long:
search=(termA OR termB OR ... OR termZ) AND climate

# Split the OR list and union the IDs from each request:
search=(termA OR ... OR termM) AND climate
search=(termN OR ... OR termZ) AND climate
Keep each chunk’s full request URL under ~4 KB. Combining results is exact because (X AND (a OR b OR c OR d)) equals (X AND (a OR b)) ∪ (X AND (c OR d)). Putting api_key and mailto in request headers instead of the URL frees a little extra room, but for a large Boolean query you will still need to split it.
Each chunked request is billed independently, just like any other request. Splitting is only needed to stay under the URL length limit — it is not a way to reduce cost.
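The splitting strategy described above can be sketched as a greedy packer that measures the full encoded URL. Several things here are assumptions: the 4094-byte figure is taken from the sample error message and treated as inclusive, the helper names are hypothetical, and `fetch_ids` stands in for whatever function you use to call the API and collect work IDs.

```python
from urllib.parse import urlencode

URL_LIMIT_BYTES = 4094  # figure from the error message; assumed to be inclusive
WORKS_URL = "https://api.openalex.org/works?"


def build_url(or_terms, and_term):
    """Build the full request URL for (t1 OR t2 ...) AND and_term."""
    query = f"({' OR '.join(or_terms)}) AND {and_term}"
    return WORKS_URL + urlencode({"search": query})


def chunk_or_terms(or_terms, and_term, limit=URL_LIMIT_BYTES):
    """Greedily pack OR terms into chunks whose full URL fits within `limit` bytes."""
    chunks, current = [], []
    for term in or_terms:
        if len(build_url([term], and_term).encode("utf-8")) > limit:
            raise ValueError(f"term {term!r} does not fit in a URL on its own")
        if current and len(build_url(current + [term], and_term).encode("utf-8")) > limit:
            chunks.append(current)  # adding this term would overflow, so close the chunk
            current = []
        current.append(term)
    if current:
        chunks.append(current)
    return chunks


def union_ids(chunks, and_term, fetch_ids):
    """Union the IDs from every chunk; `fetch_ids` maps a URL to an iterable of work IDs."""
    ids = set()
    for chunk in chunks:
        ids.update(fetch_ids(build_url(chunk, and_term)))
    return ids
```

If api_key or mailto are sent in the URL rather than in headers, add them inside `build_url` so their bytes count toward the limit. Each chunk is still one billed request.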

Phrase matching

Use double quotes to search for an exact phrase. Multi-word searches without quotes rank results higher when words appear close together.
https://api.openalex.org/works?search="fierce creatures"
Append ~N to a quoted phrase to find the words within N positions of each other, without requiring an exact match:
# "climate" and "change" within 5 words of each other
https://api.openalex.org/works?search="climate change"~5
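A small helper can produce both the exact-phrase and the proximity form. This is a sketch; `phrase` and its `within` argument are hypothetical names chosen for illustration.

```python
from typing import Optional


def phrase(words: str, within: Optional[int] = None) -> str:
    """Quote `words` as a phrase; with `within`, append ~N for proximity matching."""
    if within is not None and within < 0:
        raise ValueError("proximity must be a non-negative integer")
    quoted = f'"{words}"'
    return quoted if within is None else f"{quoted}~{within}"


print(phrase("fierce creatures"))
print(phrase("climate change", within=5))
```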

Wildcards

Use * to match zero or more characters and ? to match exactly one character:
  • Trailing wildcard: machin* matches “machine”, “machines”, “machinery”
  • Single-character wildcard: wom?n matches “woman” and “women”
The search term must have at least 3 characters before the wildcard. Leading wildcards (e.g., *ology) are not supported.
https://api.openalex.org/works?search=machin*

Fuzzy matching

Append ~N to a term to allow up to N character edits (insertions, deletions, or substitutions). The edit distance N can be 0, 1, or 2:
# Matches "machine", "machin", and other close variants
https://api.openalex.org/works?search=machin~1
The search term must have at least 3 characters before the ~. Fuzzy matching is useful for catching typos and spelling variations.
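The wildcard and edit-distance rules in this section can be checked client-side before a request is sent. The function below is a hypothetical pre-flight check covering only single unquoted terms; it mirrors the rules as stated here and is not the API's own validation.

```python
import re
from typing import Optional


def term_problem(term: str) -> Optional[str]:
    """Return a description of the rule a single search term breaks, or None if it looks fine."""
    if term.startswith('"'):
        return None  # quoted phrases, including "..."~N proximity, are out of scope here
    fuzzy = re.fullmatch(r"(.+)~(\d+)", term)
    if fuzzy:
        stem, distance = fuzzy.group(1), int(fuzzy.group(2))
        if distance > 2:
            return "edit distance must be 0, 1, or 2"
        if len(stem) < 3:
            return "need at least 3 characters before ~"
        return None
    wildcard = re.search(r"[*?]", term)
    if wildcard:
        if wildcard.start() == 0:
            return "leading wildcards are not supported"
        if wildcard.start() < 3:
            return "need at least 3 characters before the wildcard"
    return None


for candidate in ("machin*", "wom?n", "*ology", "ma*", "machin~1", "machin~3"):
    print(candidate, "->", term_problem(candidate) or "ok")
```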

Relevance score

Search results include a relevance_score property and are sorted by it (descending) by default. The score is based on:
  • Text similarity to your search term
  • Citation count (more cited = higher score)

Semantic search

If you want to match by meaning rather than keywords, or if you’re searching with a long input like an abstract or grant description, semantic search is a better fit. It uses AI embeddings to find conceptually related works even when the wording differs.
https://api.openalex.org/works?search.semantic=machine learning in healthcare
See the Semantic Search guide for examples, supported filters, and limits.

Field-specific search filters (deprecated)

Deprecated: the filter=field.search: syntax still works but is no longer recommended. Use the search query parameter instead.
The .search filter suffix searches a specific field rather than all searchable fields at once:
# Authors with "Einstein" in their name (deprecated)
https://api.openalex.org/authors?filter=display_name.search:einstein

# Works with "cubist" in the title only (deprecated)
https://api.openalex.org/works?filter=title.search:cubist
The default.search filter is equivalent to the search query parameter. Variants like .search.exact (unstemmed) and .search.no_stem also exist for some fields. These filter-based searches cost the same $1 per 1,000 requests. For autocomplete use cases, use the autocomplete endpoint instead.

Raw author name search

raw_author_name.search finds works where someone appears under a specific name as published, that is, exactly the string in the paper’s byline. It is the one filter-based search with no search parameter equivalent, so it remains the right tool for matching a person to their works.
By default, raw_author_name.search matches across all author names on the work, not within a single byline: each unquoted token can match a different author. For example, raw_author_name.search:john smith can return a work whose authors are John Doe and Alice Smith. Neither person is named “John Smith,” but both tokens appear somewhere in the byline list.
To scope a search to a single person (one byline), wrap the name in double quotes — the query becomes a phrase match against one raw_author_name at a time:
# Work-scoped: "john" and "smith" can each appear in any byline on the work (93,304 hits)
https://api.openalex.org/works?filter=raw_author_name.search:john smith

# Byline-scoped: works with an author named "John Smith" (2,727 hits)
https://api.openalex.org/works?filter=raw_author_name.search:"john smith"

Allow middle names and initials with proximity (~N)

Append ~N to a quoted name to allow up to N intervening tokens. This catches middle names and middle initials without dropping the byline scope:
# "Jane Smith" only — 1,661 hits
https://api.openalex.org/works?filter=raw_author_name.search:"jane smith"

# Also matches "Jane M Smith" — 3,779 hits
https://api.openalex.org/works?filter=raw_author_name.search:"jane smith"~1

# Also matches "Jane Marie Smith" — 4,571 hits
https://api.openalex.org/works?filter=raw_author_name.search:"jane smith"~2
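The quoted and proximity forms above can be generated from a name and a window size. This is a minimal sketch; `byline_filter`, `works_url`, and the `slop` argument are hypothetical names, and the percent-encoded URL is assumed to be decoded by the server in the usual way.

```python
from urllib.parse import urlencode


def byline_filter(name: str, slop: int = 0) -> str:
    """Build a byline-scoped filter; slop > 0 allows that many intervening tokens."""
    if slop < 0:
        raise ValueError("slop must be a non-negative integer")
    value = f'"{name}"' + (f"~{slop}" if slop else "")
    return f"raw_author_name.search:{value}"


def works_url(filter_value: str) -> str:
    """Wrap a filter value in a percent-encoded /works request URL."""
    return "https://api.openalex.org/works?" + urlencode({"filter": filter_value})


# Start narrow (slop 0) and widen only if recall is too low.
for slop in (0, 1, 2):
    print(byline_filter("jane smith", slop))
```

Comparing the hit counts returned at each step is one way to judge whether a wider window is adding real matches or mostly noise.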
A wider proximity window (a larger N, sometimes called slop) also picks up co-author noise on common names, so start narrow and widen only when you need more recall.

Handle name variants with OR

The filter value supports Lucene OR between quoted phrases. Use one OR’d query (not multiple raw_author_name.search: clauses — those AND together and return nothing) to combine name forms:
# Comma-reversed byline ("Priem, Jason" → "Priem Jason")
https://api.openalex.org/works?filter=raw_author_name.search:"jason priem" OR "priem jason"

# First-initial form
https://api.openalex.org/works?filter=raw_author_name.search:"j priem" OR "priem j"
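The name forms shown above can be generated from a first and last name and joined into one OR’d filter value. This sketch assumes that more than two quoted phrases can be OR’d in a single value, which the examples here do not show directly; `name_variants` and `variants_filter` are hypothetical helpers.

```python
def name_variants(first: str, last: str) -> list:
    """Return full-name and first-initial forms, each in both orders."""
    first, last = first.lower(), last.lower()
    initial = first[0]
    return [
        f"{first} {last}",    # "jason priem"
        f"{last} {first}",    # comma-reversed byline, "Priem, Jason"
        f"{initial} {last}",  # first-initial form
        f"{last} {initial}",  # reversed first-initial form
    ]


def variants_filter(first: str, last: str) -> str:
    """Join every variant as a quoted phrase in a single OR'd filter value."""
    phrases = " OR ".join(f'"{v}"' for v in name_variants(first, last))
    return f"raw_author_name.search:{phrases}"


print(variants_filter("Jason", "Priem"))
```

Initial-only forms match anyone sharing that initial and surname, so it may be worth running them as a separate query and reviewing those hits more carefully.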
For a step-by-step walkthrough that uses these patterns to audit and correct an author profile’s works, see the Audit an Author Profile’s Works recipe.