API Reference

Complete API reference for the pubmed_client module.

Python bindings for the PubMed and PMC API client.

This module provides a high-performance Python interface to PubMed and PMC APIs for retrieving biomedical research articles.

Main classes:

  • Client – Combined client for both PubMed and PMC

  • PubMedClient – Client for PubMed metadata

  • PmcClient – Client for PMC full-text articles

  • ClientConfig – Configuration for API clients

Examples

>>> import pubmed_client
>>> client = pubmed_client.Client()
>>> articles = client.pubmed.search_and_fetch("covid-19", 10)
>>> for article in articles:
...     print(article.title)
class pubmed_client.ClientConfig

Bases: object

Python wrapper for ClientConfig

Configuration for PubMed and PMC clients.

Examples

>>> config = ClientConfig().with_api_key("your_api_key").with_email("you@example.com")
>>> client = Client.with_config(config)
with_api_key(api_key)

Set the NCBI API key for increased rate limits (10 req/sec instead of 3)

with_cache()

Enable default response caching

with_email(email)

Set the email address for identification (recommended by NCBI)

with_rate_limit(rate_limit)

Set custom rate limit in requests per second
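As a rough illustration of what a requests-per-second limit implies, the sketch below converts a rate into the minimum spacing between calls. It is arithmetic only; the helper name is hypothetical and is not part of pubmed_client, and the library's own throttling strategy is not described here.

```python
def min_interval_seconds(requests_per_second: float) -> float:
    """Return the smallest gap between requests that respects the rate."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return 1.0 / requests_per_second


# Limits quoted for with_api_key(): 3 req/sec without a key, 10 with one.
gap_without_key = min_interval_seconds(3)
gap_with_key = min_interval_seconds(10)
print(f"{gap_without_key:.3f} s vs {gap_with_key:.3f} s between requests")
```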

with_timeout_seconds(timeout_seconds)

Set HTTP request timeout in seconds

with_tool(tool)

Set the tool name for identification (default: “pubmed-client-py”)

class pubmed_client.Affiliation

Bases: object

Python wrapper for Author affiliation

address
country
department
email
institution
class pubmed_client.Author

Bases: object

Python wrapper for Author

affiliations()

Get list of affiliations

email
full_name
given_names
initials
is_corresponding
orcid
roles()

Get list of roles/contributions

suffix
surname
class pubmed_client.PubMedArticle

Bases: object

Python wrapper for PubMedArticle

abstract_text
article_types()

Get article types

author_count
authors()

Get list of authors

doi
issn
issue
journal
journal_abbreviation
keywords()

Get keywords

language
pages
pmc_id
pmid
pub_date
title
volume
class pubmed_client.RelatedArticles

Bases: object

Python wrapper for RelatedArticles

related_pmids
source_pmids
class pubmed_client.PmcLinks

Bases: object

Python wrapper for PmcLinks

pmc_ids
source_pmids
class pubmed_client.Citations

Bases: object

Python wrapper for Citations

citing_pmids
source_pmids
class pubmed_client.DatabaseInfo

Bases: object

Python wrapper for DatabaseInfo

build
count
description
last_update
menu_name
name
class pubmed_client.CitationQuery(journal, year, volume, first_page, author_name, key)

Bases: object

Input for a single citation match query

Used with the ECitMatch API to find PMIDs from citation information (journal, year, volume, page, author).

Examples

>>> query = CitationQuery(
...     journal="proc natl acad sci u s a",
...     year="1991",
...     volume="88",
...     first_page="3248",
...     author_name="mann bj",
...     key="Art1",
... )
author_name
first_page
journal
key
volume
year
class pubmed_client.CitationMatch

Bases: object

Result of a single citation match from the ECitMatch API

journal

Journal title from the query

year

Year from the query

volume

Volume from the query

first_page

First page from the query

author_name

Author name from the query

key

User-defined key from the query

pmid

Matched PMID (None if not found)

status

Match status (“found”, “not_found”, or “ambiguous”)

class pubmed_client.CitationMatches

Bases: object

Results from ECitMatch API for batch citation matching

matches

List of CitationMatch results

found_count()

Get the number of successful matches

matches

Get the list of citation match results

class pubmed_client.DatabaseCount

Bases: object

Record count for a single NCBI database from the EGQuery API

db_name

Internal database name (e.g., “pubmed”, “pmc”)

menu_name

Human-readable database name (e.g., “PubMed”, “PMC”)

count

Number of matching records

status

Query status (e.g., “Ok”)

class pubmed_client.GlobalQueryResults

Bases: object

Results from EGQuery API for global database search

term

The query term that was searched

results

List of DatabaseCount results for each database

count_for(db_name)

Get count for a specific database

non_zero()

Get results with count > 0

results

Get the list of database count results

term
class pubmed_client.EPostResult

Bases: object

Python wrapper for EPostResult

Result from EPost API for uploading PMIDs to the NCBI History server. Contains WebEnv and query_key identifiers for use with subsequent API calls.

webenv

WebEnv session identifier

query_key

Query key for the uploaded IDs within the session

Examples

>>> client = PubMedClient()
>>> result = client.epost(["31978945", "33515491"])
>>> print(f"WebEnv: {result.webenv}, Query Key: {result.query_key}")
class pubmed_client.SpellCheckResult

Bases: object

Python wrapper for SpellCheckResult

database

The database that was queried

query

The original query string

corrected_query

The corrected/suggested query

has_corrections

Whether any spelling corrections were made

replacements

List of corrected terms

Examples

>>> client = PubMedClient()
>>> result = client.spell_check("asthmaa")
>>> print(result.corrected_query)
asthma
>>> result.has_corrections
True
class pubmed_client.ArticleSummary

Bases: object

Lightweight article summary from the ESummary API

Contains basic metadata (title, authors, journal, dates) without abstracts, MeSH terms, or chemical lists. Faster than PubMedArticle for bulk retrieval.

Examples

>>> client = PubMedClient()
>>> summaries = client.fetch_summaries(["31978945", "33515491"])
>>> for s in summaries:
...     print(f"{s.pmid}: {s.title} ({s.pub_date})")
authors
doi
epub_date
essn
full_journal_name
issn
issue
journal
languages
pages
pmc_id
pmc_ref_count
pmid
pub_date
pub_types
record_status
sort_pub_date
title
volume
class pubmed_client.PmcAffiliation

Bases: object

Python wrapper for PMC Affiliation

address
country
department
id
institution
class pubmed_client.PmcAuthor

Bases: object

Python wrapper for PMC Author

affiliations()

Get list of affiliations

email
full_name
given_names
is_corresponding
orcid
roles()

Get list of roles/contributions

surname
class pubmed_client.Figure

Bases: object

Python wrapper for Figure

alt_text
caption
fig_type
graphic_href
id
label
class pubmed_client.ExtractedFigure

Bases: object

Python wrapper for ExtractedFigure

Represents a figure that has been extracted from a PMC tar.gz archive, combining XML metadata with actual file information.

dimensions

Image dimensions as (width, height) tuple if available

extracted_file_path

Actual file path where the figure was extracted

figure

Figure metadata from XML (caption, label, etc.)

file_size

File size in bytes

class pubmed_client.Table

Bases: object

Python wrapper for Table

caption
id
label
class pubmed_client.Reference

Bases: object

Python wrapper for Reference

doi
id
pmid
source
title
year
class pubmed_client.ArticleSection

Bases: object

Python wrapper for ArticleSection

content
section_type
title
class pubmed_client.PmcFullText

Bases: object

Python wrapper for PmcFullText

authors()

Get list of authors

doi
figures()

Get list of all figures from all sections

pmcid
pmid
references()

Get list of references

sections()

Get list of sections

tables()

Get list of all tables from all sections

title
to_markdown()

Convert the article to Markdown format

Returns:

A Markdown-formatted string representation of the article

Example

>>> full_text = client.pmc.fetch_full_text("PMC7906746")
>>> markdown = full_text.to_markdown()
>>> print(markdown)
class pubmed_client.OaSubsetInfo

Bases: object

Python wrapper for OaSubsetInfo

Information about OA (Open Access) subset availability for a PMC article. The OA subset contains articles with programmatic access to full-text XML.

citation

Citation string (if available)

download_format

Format of the download (e.g., “tgz”, “pdf”)

download_link

Download link for tar.gz package (if available)

error_code

Error code if not in OA subset

error_message

Error message if not in OA subset

is_oa_subset

Whether the article is in the OA subset

license

License type (if available)

pmcid

PMC ID (e.g., “PMC7906746”)

retracted

Whether the article is retracted

updated

Last updated timestamp for the download

class pubmed_client.PubMedClient

Bases: object

PubMed client for searching and fetching article metadata

Examples

>>> client = PubMedClient()
>>> articles = client.search_and_fetch("covid-19", 10)
>>> article = client.fetch_article("31978945")
epost(pmids: list[str]) EPostResult

Upload a list of PMIDs to the NCBI History server using EPost

Stores UIDs on the server and returns WebEnv/query_key identifiers that can be used with subsequent API calls for batch fetching.

Parameters:

pmids – List of PubMed IDs as strings

Returns:

EPostResult containing webenv and query_key

Examples

>>> client = PubMedClient()
>>> result = client.epost(["31978945", "33515491", "25760099"])
>>> print(f"WebEnv: {result.webenv}")
>>> print(f"Query Key: {result.query_key}")
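To show how WebEnv and query_key are typically consumed by a follow-up E-utilities call, here is a minimal sketch that builds an EFetch query string from them. The parameter names (db, WebEnv, query_key, retstart, retmax, retmode) are standard E-utilities parameters; the helper name and the placeholder values are assumptions, and this is not necessarily how pubmed_client issues the request internally.

```python
from urllib.parse import urlencode


def efetch_history_params(webenv: str, query_key: str,
                          retstart: int = 0, retmax: int = 200) -> str:
    """Build an EFetch query string that reads IDs from the History server."""
    return urlencode({
        "db": "pubmed",
        "WebEnv": webenv,
        "query_key": query_key,
        "retstart": retstart,  # offset of the first record in this page
        "retmax": retmax,      # number of records in this page
        "retmode": "xml",
    })


# Placeholder identifiers; real values come from EPostResult.webenv / .query_key.
print(efetch_history_params("EXAMPLE_WEBENV", "1"))
```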
fetch_all_by_pmids(pmids: list[str]) list[PubMedArticle]

Fetch all articles for a list of PMIDs using EPost and the History server

Uploads the PMID list via EPost (HTTP POST), then fetches articles in paginated batches. Recommended for large PMID lists (hundreds or thousands).

Parameters:

pmids – List of PubMed IDs as strings

Returns:

List of PubMedArticle objects

Examples

>>> client = PubMedClient()
>>> articles = client.fetch_all_by_pmids(["31978945", "33515491", "25760099"])
>>> for a in articles:
...     print(a.title)
fetch_article(pmid)

Fetch a single article by PMID

Parameters:

pmid – PubMed ID as a string

Returns:

PubMedArticle object

fetch_articles(pmids: list[str]) list[PubMedArticle]

Fetch multiple articles by PMIDs in a single batch request

This is significantly more efficient than fetching articles one by one, as it sends fewer HTTP requests to the NCBI API. For large numbers of PMIDs, the request is automatically split into batches of 200.

Parameters:

pmids – List of PubMed IDs as strings

Returns:

List of PubMedArticle objects

Examples

>>> client = PubMedClient()
>>> articles = client.fetch_articles(["31978945", "33515491", "25760099"])
>>> for article in articles:
...     print(f"{article.pmid}: {article.title}")
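The batching described above can be pictured with a small, library-independent sketch that splits a PMID list into groups of at most 200. The function name is hypothetical; only the batch size of 200 comes from the description, and the library's actual splitting code may differ.

```python
def chunked(pmids: list[str], size: int = 200) -> list[list[str]]:
    """Split a list of PMIDs into consecutive batches of at most `size`."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [pmids[i:i + size] for i in range(0, len(pmids), size)]


# 450 made-up IDs yield batches of 200, 200, and 50.
batches = chunked([str(n) for n in range(450)])
print([len(batch) for batch in batches])
```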
fetch_summaries(pmids: list[str]) list[ArticleSummary]

Fetch lightweight article summaries by PMIDs using the ESummary API

Returns basic metadata (title, authors, journal, dates, DOI) without abstracts, MeSH terms, or chemical lists. Faster than fetch_articles() for bulk metadata retrieval.

Parameters:

pmids – List of PubMed IDs as strings

Returns:

List of ArticleSummary objects

Examples

>>> client = PubMedClient()
>>> summaries = client.fetch_summaries(["31978945", "33515491"])
>>> for s in summaries:
...     print(f"{s.pmid}: {s.title}")
get_citations(pmids)

Get citing articles for given PMIDs

Returns articles that cite the specified PMIDs from the PubMed database only.

Important: Citation counts from this method may be lower than those reported by Google Scholar or scite.ai, because this method only counts peer-reviewed articles indexed in PubMed. Other sources also include preprints, books, and conference proceedings.

Example: PMID 31978945 shows ~14,000 citations in PubMed vs ~23,000 in scite.ai. This difference is expected, since the method provides PubMed-specific citation data.

Parameters:

pmids – List of PubMed IDs

Returns:

Citations object containing citing article PMIDs

get_database_info(database)

Get detailed information about a specific database

Parameters:

database – Database name (e.g., “pubmed”, “pmc”)

Returns:

DatabaseInfo object

get_database_list()

Get list of all available NCBI databases

Returns:

List of database names

Get PMC links for given PMIDs (full-text availability)

Parameters:

pmids – List of PubMed IDs

Returns:

PmcLinks object containing available PMC IDs

Get related articles for given PMIDs

Parameters:

pmids – List of PubMed IDs

Returns:

RelatedArticles object

global_query(term)

Query all NCBI databases for record counts using the EGQuery API

Returns the number of records matching the query in each Entrez database. Useful for exploratory searches.

Parameters:

term – Search query string

Returns:

GlobalQueryResults object containing counts per database

Examples

>>> client = PubMedClient()
>>> results = client.global_query("asthma")
>>> for db in results.non_zero():
...     print(f"{db.menu_name}: {db.count}")
match_citations(citations)

Match citations to PMIDs using the ECitMatch API

Takes citation information (journal, year, volume, page, author) and returns corresponding PMIDs. Useful for identifying PMIDs from reference lists.

Parameters:

citations – List of CitationQuery objects

Returns:

CitationMatches object containing match results

Examples

>>> client = PubMedClient()
>>> queries = [
...     CitationQuery("proc natl acad sci u s a", "1991", "88", "3248", "mann bj", "Art1"),
...     CitationQuery("science", "1987", "235", "182", "palmenberg ac", "Art2"),
... ]
>>> results = client.match_citations(queries)
>>> for m in results.matches:
...     print(f"{m.key}: {m.pmid} ({m.status})")
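For orientation, the ECitMatch service accepts each citation as a pipe-delimited string with a trailing pipe, and multiple citations are separated by a carriage return. The sketch below formats the same fields that CitationQuery carries; the helper name is hypothetical, and whether pubmed_client builds the string exactly this way is an assumption.

```python
def citation_to_bdata(journal: str, year: str, volume: str,
                      first_page: str, author_name: str, key: str) -> str:
    """Format one citation as an ECitMatch bdata entry."""
    return "|".join([journal, year, volume, first_page, author_name, key]) + "|"


citations = [
    ("proc natl acad sci u s a", "1991", "88", "3248", "mann bj", "Art1"),
    ("science", "1987", "235", "182", "palmenberg ac", "Art2"),
]
# Entries are joined with a carriage return before URL-encoding.
bdata = "\r".join(citation_to_bdata(*c) for c in citations)
print(bdata.replace("\r", "\n"))
```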
search_and_fetch(query: str | SearchQuery, limit: int) list[PubMedArticle]

Search for articles and fetch their metadata

Parameters:
  • query – Search query (either a string or SearchQuery object)

  • limit – Maximum number of articles to return (ignored if query is SearchQuery)

Returns:

List of PubMedArticle objects

Examples

>>> client = PubMedClient()
>>> # Using string query
>>> articles = client.search_and_fetch("covid-19", 10)
>>> # Using SearchQuery object
>>> query = SearchQuery().query("cancer").published_after(2020).limit(50)
>>> articles = client.search_and_fetch(query, 0)  # limit parameter ignored
search_and_fetch_summaries(query: str | SearchQuery, limit: int) list[ArticleSummary]

Search and fetch lightweight summaries in a single operation

Combines search and ESummary fetch. Use this when you only need basic metadata and want faster retrieval than search_and_fetch().

Parameters:
  • query – Search query (either a string or SearchQuery object)

  • limit – Maximum number of articles (ignored if query is SearchQuery)

Returns:

List of ArticleSummary objects

Examples

>>> client = PubMedClient()
>>> summaries = client.search_and_fetch_summaries("covid-19", 20)
>>> for s in summaries:
...     print(f"{s.pmid}: {s.title}")
search_articles(query: str | SearchQuery, limit: int) list[str]

Search for articles and return PMIDs only

This method returns only the list of PMIDs matching the query, which is faster than fetching full article metadata.

Parameters:
  • query – Search query (either a string or SearchQuery object)

  • limit – Maximum number of PMIDs to return (ignored if query is SearchQuery)

Returns:

List of PMIDs as strings

Examples

>>> client = PubMedClient()
>>> # Using string query
>>> pmids = client.search_articles("covid-19", 100)
>>> pmids = client.search_articles("cancer[ti] AND therapy[tiab]", 50)
>>> # Using SearchQuery object
>>> query = SearchQuery().query("covid-19").limit(100)
>>> pmids = client.search_articles(query, 0)  # limit parameter ignored
spell_check(term)

Check spelling of a search term using the ESpell API

Provides spelling suggestions for terms within a single text query. Uses the PubMed database by default.

Parameters:

term – The search term to spell-check

Returns:

SpellCheckResult with the corrected query and details

Examples

>>> client = PubMedClient()
>>> result = client.spell_check("asthmaa")
>>> print(result.corrected_query)
asthma
spell_check_db(term, db)

Check spelling of a search term against a specific database

Spelling suggestions are database-specific, so use the same database you plan to search.

Parameters:
  • term – The search term to spell-check

  • db – The NCBI database to check against (e.g., “pubmed”, “pmc”)

Returns:

SpellCheckResult with the corrected query and details

Examples

>>> client = PubMedClient()
>>> result = client.spell_check_db("fiberblast", "pmc")
>>> print(result.corrected_query)
fibroblast
static with_config(config)

Create a new PubMed client with custom configuration

class pubmed_client.PmcClient

Bases: object

PMC client for fetching full-text articles

Examples

>>> client = PmcClient()
>>> full_text = client.fetch_full_text("PMC7906746")
>>> pmcid = client.check_pmc_availability("31978945")
check_pmc_availability(pmid)

Check if a PubMed article has PMC full text available

Parameters:

pmid – PubMed ID as a string

Returns:

PMC ID if available, None otherwise

download_and_extract_tar(pmcid, output_dir)

Download and extract PMC tar.gz archive

Downloads the tar.gz file for the specified PMC ID and extracts all files to the output directory.

Parameters:
  • pmcid – PMC ID (e.g., “PMC7906746” or “7906746”)

  • output_dir – Directory path where files should be extracted

Returns:

List of extracted file paths

Note

This method is only available on non-WASM platforms

Example

>>> client = PmcClient()
>>> files = client.download_and_extract_tar("PMC7906746", "./output")
>>> for file in files:
...     print(file)
extract_figures_with_captions(pmcid, output_dir)

Extract figures with captions from PMC article

Downloads the tar.gz file for the specified PMC ID, extracts all files, and matches figures with their captions from the XML metadata.

Parameters:
  • pmcid – PMC ID (e.g., “PMC7906746” or “7906746”)

  • output_dir – Directory path where files should be extracted

Returns:

List of ExtractedFigure objects containing metadata and file information

Note

This method is only available on non-WASM platforms

Example

>>> client = PmcClient()
>>> figures = client.extract_figures_with_captions("PMC7906746", "./output")
>>> for fig in figures:
...     print(f"{fig.figure.id}: {fig.extracted_file_path}")
...     print(f"  Caption: {fig.figure.caption}")
...     print(f"  Size: {fig.file_size} bytes")
...     print(f"  Dimensions: {fig.dimensions}")
fetch_full_text(pmcid)

Fetch full text article from PMC

Parameters:

pmcid – PMC ID (e.g., “PMC7906746”)

Returns:

PmcFullText object containing structured article content

is_oa_subset(pmcid)

Check if a PMC article is in the OA (Open Access) subset

The OA subset contains articles with programmatic access to full-text XML. Some publishers restrict programmatic access even though the article may be viewable on the PMC website.

Parameters:

pmcid – PMC ID (with or without “PMC” prefix, e.g., “PMC7906746” or “7906746”)

Returns:

OaSubsetInfo object containing detailed information about OA availability

Example

>>> client = PmcClient()
>>> oa_info = client.is_oa_subset("PMC7906746")
>>> if oa_info.is_oa_subset:
...     print("Article is in OA subset")
...     if oa_info.download_link:
...         print(f"Download: {oa_info.download_link}")
... else:
...     print(f"Not in OA subset: {oa_info.error_message}")
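Because the pmcid argument is accepted with or without the "PMC" prefix, callers who store IDs may want a single canonical form. The helper below is a hypothetical sketch of such a normalization; it is not part of pubmed_client, and the library's own parsing rules may be more lenient or stricter.

```python
def normalize_pmcid(pmcid: str) -> str:
    """Return the ID in canonical 'PMC<digits>' form."""
    value = pmcid.strip()
    digits = value[3:] if value.upper().startswith("PMC") else value
    # Assumption: a valid ID consists of ASCII digits only after the prefix.
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a PMC ID: {pmcid!r}")
    return f"PMC{digits}"


print(normalize_pmcid("7906746"))
print(normalize_pmcid("PMC7906746"))
```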
static with_config(config)

Create a new PMC client with custom configuration

class pubmed_client.Client

Bases: object

Combined client with both PubMed and PMC functionality

This is the main client you’ll typically use. It provides access to both PubMed metadata searches and PMC full-text retrieval.

Examples

>>> client = Client()
>>> # Access PubMed client
>>> articles = client.pubmed.search_and_fetch("covid-19", 10)
>>> # Access PMC client
>>> full_text = client.pmc.fetch_full_text("PMC7906746")
>>> # Search with full text
>>> results = client.search_with_full_text("covid-19", 5)
get_citations(pmids)

Get citing articles for given PMIDs

get_database_info(database)

Get detailed information about a specific database

get_database_list()

Get list of all available NCBI databases

Get PMC links for given PMIDs

Get related articles for given PMIDs

pmc

Get PMC client for full-text operations

pubmed

Get PubMed client for metadata operations

search_with_full_text(query, limit)

Search for articles and attempt to fetch full text for each

This is a convenience method that searches PubMed and attempts to fetch PMC full text for each result when available.

Parameters:
  • query – Search query string

  • limit – Maximum number of articles to process

Returns:

List of tuples (PubMedArticle, Optional[PmcFullText])
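Since the second element of each tuple may be None, callers usually need to branch on it. The sketch below uses stand-in dataclasses (FakeArticle and FakeFullText are hypothetical, not library types) to show one way of separating results that have full text from those that do not.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeArticle:
    """Stand-in for PubMedArticle, for illustration only."""
    pmid: str
    title: str


@dataclass
class FakeFullText:
    """Stand-in for PmcFullText, for illustration only."""
    pmcid: str


def split_by_full_text(pairs):
    """Separate (article, full_text) pairs by whether full text is present."""
    with_text = [(a, ft) for a, ft in pairs if ft is not None]
    without_text = [a for a, ft in pairs if ft is None]
    return with_text, without_text


# Made-up data shaped like the documented return value.
pairs: list[tuple[FakeArticle, Optional[FakeFullText]]] = [
    (FakeArticle("1", "Has full text"), FakeFullText("PMC1")),
    (FakeArticle("2", "Metadata only"), None),
]
with_text, without_text = split_by_full_text(pairs)
print(len(with_text), len(without_text))
```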

spell_check(term)

Check spelling of a search term using the ESpell API

Provides spelling suggestions for terms within a single text query. Uses the PubMed database by default.

Parameters:

term – The search term to spell-check

Returns:

SpellCheckResult with the corrected query and details

Examples

>>> client = Client()
>>> result = client.spell_check("asthmaa")
>>> print(result.corrected_query)
static with_config(config)

Create a new combined client with custom configuration

class pubmed_client.SearchQuery

Bases: object

Python wrapper for SearchQuery

Builder for constructing PubMed search queries programmatically.

Examples

>>> query = SearchQuery().query("covid-19").limit(10)
>>> query_string = query.build()
>>> print(query_string)
covid-19
and_(other)

Combine this query with another using AND logic

Combines two queries by wrapping each in parentheses and joining with AND. If either query is empty, returns the non-empty query. The result uses the higher limit of the two queries.

Parameters:

other – Another SearchQuery to combine with

Returns:

New query with combined logic

Return type:

SearchQuery

Example

>>> q1 = SearchQuery().query("covid-19")
>>> q2 = SearchQuery().query("vaccine")
>>> combined = q1.and_(q2)
>>> combined.build()
'(covid-19) AND (vaccine)'
>>> # Complex chaining
>>> result = SearchQuery().query("cancer") \
...     .and_(SearchQuery().query("treatment")) \
...     .and_(SearchQuery().query("2024[pdat]"))
>>> result.build()
'((cancer) AND (treatment)) AND (2024[pdat])'
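The combination rule described above (wrap each side in parentheses, join with the operator, and fall back to the non-empty side) can be sketched on plain strings. This is an illustrative model with a hypothetical function name, not the library's implementation, and it ignores limits.

```python
def combine(left: str, right: str, operator: str) -> str:
    """Join two query strings with a Boolean operator, parenthesizing each."""
    left, right = left.strip(), right.strip()
    if not left:
        return right
    if not right:
        return left
    return f"({left}) {operator} ({right})"


simple = combine("covid-19", "vaccine", "AND")
chained = combine(combine("cancer", "treatment", "AND"), "2024[pdat]", "AND")
print(simple)
print(chained)
```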
article_type(type_name)

Filter by a single article type

Parameters:

type_name – Article type name (case-insensitive). Supported types: “Clinical Trial”, “Review”, “Systematic Review”, “Meta-Analysis”, “Case Reports”, “Randomized Controlled Trial” (or “RCT”), “Observational Study”

Returns:

Self for method chaining

Return type:

SearchQuery

Raises:

ValueError – If article type is not recognized

Example

>>> query = SearchQuery().query("cancer").article_type("Clinical Trial")
>>> query.build()
'cancer AND Clinical Trial[pt]'
article_types(types)

Filter by multiple article types (OR logic)

When multiple types are provided, they are combined with OR logic. An empty list is silently ignored (no filter is added).

Parameters:

types – List of article type names (case-insensitive)

Returns:

Self for method chaining

Return type:

SearchQuery

Raises:

ValueError – If any article type is not recognized

Example

>>> query = SearchQuery().query("treatment").article_types(["RCT", "Meta-Analysis"])
>>> query.build()
'treatment AND (Randomized Controlled Trial[pt] OR Meta-Analysis[pt])'
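A minimal sketch of how such a filter string could be assembled is shown below: names are matched case-insensitively, the "RCT" alias is expanded, and several types are OR-ed inside parentheses. The function and table names are hypothetical; only the supported type names and the [pt] tag come from the documentation above.

```python
SUPPORTED_TYPES = [
    "Clinical Trial", "Review", "Systematic Review", "Meta-Analysis",
    "Case Reports", "Randomized Controlled Trial", "Observational Study",
]
# Lower-cased lookup table, plus the documented "RCT" alias.
_CANONICAL = {name.lower(): name for name in SUPPORTED_TYPES}
_CANONICAL["rct"] = "Randomized Controlled Trial"


def article_type_filter(types: list[str]) -> str:
    """Build a publication-type filter; an empty list yields no filter."""
    tags = []
    for name in types:
        key = name.strip().lower()
        if key not in _CANONICAL:
            raise ValueError(f"unrecognized article type: {name!r}")
        tags.append(f"{_CANONICAL[key]}[pt]")
    if len(tags) <= 1:
        return tags[0] if tags else ""
    return "(" + " OR ".join(tags) + ")"


print("treatment AND " + article_type_filter(["RCT", "Meta-Analysis"]))
```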
build()

Build the final PubMed query string

Terms are joined with space separators; PubMed combines space-separated terms with AND by default.

Returns:

Query string for PubMed E-utilities API

Return type:

str

Raises:

ValueError – If no search terms have been added

Example

>>> query = SearchQuery().query("covid-19").query("treatment")
>>> query.build()
'covid-19 treatment'
exclude(excluded)

Exclude articles matching the given query

Excludes results from this query that match the excluded query. This is the recommended way to filter out unwanted results. If either query is empty, returns the base query unchanged.

Parameters:

excluded – SearchQuery representing articles to exclude

Returns:

New query with exclusion logic

Return type:

SearchQuery

Example

>>> base = SearchQuery().query("cancer treatment")
>>> exclude = SearchQuery().query("animal studies")
>>> filtered = base.exclude(exclude)
>>> filtered.build()
'(cancer treatment) NOT (animal studies)'
>>> # Exclude multiple types of studies
>>> human_only = SearchQuery().query("therapy") \
...     .exclude(SearchQuery().query("animal studies")) \
...     .exclude(SearchQuery().query("in vitro"))
free_full_text_only()

Filter to articles with free full text (open access)

This includes articles that are freely available from PubMed Central and other open access sources.

Returns:

Self for method chaining

Return type:

SearchQuery

Example

>>> query = SearchQuery().query("cancer").free_full_text_only()
>>> query.build()
'cancer AND free full text[sb]'
full_text_only()

Filter to articles with full text links

This includes both free full text and subscription-based full text articles. Use free_full_text_only() if you only want open access articles.

Returns:

Self for method chaining

Return type:

SearchQuery

Example

>>> query = SearchQuery().query("diabetes").full_text_only()
>>> query.build()
'diabetes AND full text[sb]'
get_limit()

Get the limit for this query

Returns the configured limit or the default of 20 if not set.

Returns:

Maximum number of results (default: 20)

Return type:

int

Example

>>> query = SearchQuery().query("cancer").limit(100)
>>> query.get_limit()
100
>>> query2 = SearchQuery().query("diabetes")
>>> query2.get_limit()
20
group()

Add parentheses around the current query for grouping

Wraps the query in parentheses to control operator precedence in complex queries. Returns an empty query if the current query is empty.

Returns:

New query wrapped in parentheses

Return type:

SearchQuery

Example

>>> query = SearchQuery().query("cancer").or_(SearchQuery().query("tumor")).group()
>>> query.build()
'((cancer) OR (tumor))'
>>> # Controlling precedence
>>> q1 = SearchQuery().query("a").or_(SearchQuery().query("b")).group()
>>> q2 = SearchQuery().query("c").or_(SearchQuery().query("d")).group()
>>> result = q1.and_(q2)
>>> result.build()
'(((a) OR (b))) AND (((c) OR (d)))'
limit(limit=None)

Set the maximum number of results to return

Validates that limit is >0 and ≤10,000. None is treated as “use default” (20).

Parameters:

limit – Maximum number of results (None = use default of 20)

Returns:

Self for method chaining

Return type:

SearchQuery

Raises:

ValueError – If limit ≤ 0 or limit > 10,000

Example

>>> query = SearchQuery().query("cancer").limit(50)
negate()

Negate this query using NOT logic

Wraps the current query with NOT operator. This is typically used in combination with other queries to exclude results. Returns an empty query if the current query is empty.

Returns:

New query with NOT logic

Return type:

SearchQuery

Example

>>> query = SearchQuery().query("cancer").negate()
>>> query.build()
'NOT (cancer)'
>>> # More practical: exclude from search results
>>> base = SearchQuery().query("treatment")
>>> excluded = SearchQuery().query("animal studies").negate()
>>> # (Note: use exclude() method for this pattern)
or_(other)

Combine this query with another using OR logic

Combines two queries by wrapping each in parentheses and joining with OR. If either query is empty, returns the non-empty query. The result uses the higher limit of the two queries.

Parameters:

other – Another SearchQuery to combine with

Returns:

New query with combined logic

Return type:

SearchQuery

Example

>>> q1 = SearchQuery().query("diabetes")
>>> q2 = SearchQuery().query("hypertension")
>>> combined = q1.or_(q2)
>>> combined.build()
'(diabetes) OR (hypertension)'
>>> # Find articles about either condition
>>> result = SearchQuery().query("cancer") \
...     .or_(SearchQuery().query("tumor")) \
...     .or_(SearchQuery().query("oncology"))
>>> result.build()
'((cancer) OR (tumor)) OR (oncology)'
pmc_only()

Filter to articles with PMC full text

This filters to articles that have full text available in PubMed Central (PMC).

Returns:

Self for method chaining

Return type:

SearchQuery

Example

>>> query = SearchQuery().query("genomics").pmc_only()
>>> query.build()
'genomics AND pmc[sb]'
published_after(year)

Filter to articles published in or after a specific year

Equivalent to published_between(year, None).

Parameters:

year – Earliest publication year, inclusive (must be 1800-3000)

Returns:

Self for method chaining

Return type:

SearchQuery

Raises:

ValueError – If year is outside the valid range (1800-3000)

Example

>>> query = SearchQuery().query("crispr").published_after(2020)
>>> query.build()
'crispr AND 2020:3000[pdat]'
published_before(year)

Filter to articles published in or before a specific year

Filters articles from 1900 up to and including the specified year.

Parameters:

year – Latest publication year, inclusive (must be 1800-3000)

Returns:

Self for method chaining

Return type:

SearchQuery

Raises:

ValueError – If year is outside the valid range (1800-3000)

Example

>>> query = SearchQuery().query("genome").published_before(2020)
>>> query.build()
'genome AND 1900:2020[pdat]'
published_between(start_year, end_year=None)

Filter by publication date range

Filters articles published between start_year and end_year (inclusive). If end_year is None, filters from start_year onwards (up to year 3000).

Parameters:
  • start_year – Start year (inclusive, must be 1800-3000)

  • end_year – End year (inclusive, optional, must be 1800-3000 if provided)

Returns:

Self for method chaining

Return type:

SearchQuery

Raises:

ValueError – If years are outside valid range or start_year > end_year

Example

>>> # Filter to 2020-2024
>>> query = SearchQuery().query("cancer").published_between(2020, 2024)
>>> query.build()
'cancer AND 2020:2024[pdat]'
>>> # Filter from 2020 onwards
>>> query = SearchQuery().query("treatment").published_between(2020, None)
>>> query.build()
'treatment AND 2020:3000[pdat]'
published_in_year(year)

Filter to articles published in a specific year

Parameters:

year – Year to filter by (must be between 1800 and 3000)

Returns:

Self for method chaining

Return type:

SearchQuery

Raises:

ValueError – If year is outside the valid range (1800-3000)

Example

>>> query = SearchQuery().query("covid-19").published_in_year(2024)
>>> query.build()
'covid-19 AND 2024[pdat]'
query(term=None)

Add a search term to the query

Terms are accumulated (not replaced) and will be space-separated in the final query. None and empty strings (after trimming) are silently filtered out.

Parameters:

term – Search term string (None or empty strings are filtered)

Returns:

Self for method chaining

Return type:

SearchQuery

Example

>>> query = SearchQuery().query("covid-19").query("treatment")
>>> query.build()
'covid-19 treatment'
sort(sort_order)

Set the sort order for search results

Controls how PubMed orders the search results. The default (when not specified) is relevance-based sorting.

Parameters:

sort_order – Sort order string (case-insensitive). Supported values: “relevance”, “pub_date” (or “publication_date”), “author” (or “first_author”), “journal” (or “journal_name”)

Returns:

Self for method chaining

Return type:

SearchQuery

Raises:

ValueError – If sort order is not recognized

Example

>>> query = SearchQuery().query("cancer").sort("pub_date")
>>> # Results will be sorted by publication date (newest first)
terms(terms=None)

Add multiple search terms at once

Each term is processed like query(). None items and empty strings are filtered out.

Parameters:

terms – List of search term strings

Returns:

Self for method chaining

Return type:

SearchQuery

Example

>>> query = SearchQuery().terms(["covid-19", "vaccine", "efficacy"])
>>> query.build()
'covid-19 vaccine efficacy'