API Reference¶

Complete API reference for the pubmed_client module.

Python bindings for PubMed and PMC API client

This module provides a high-performance Python interface to PubMed and PMC APIs for retrieving biomedical research articles.

Main classes:

Client – Combined client for both PubMed and PMC
PubMedClient – Client for PubMed metadata
PmcClient – Client for PMC full-text articles
ClientConfig – Configuration for API clients

Examples

>>> import pubmed_client
>>> client = pubmed_client.Client()
>>> articles = client.pubmed.search_and_fetch("covid-19", 10)
>>> for article in articles:
...     print(article.title)

class pubmed_client.ClientConfig¶

Bases: object

Python wrapper for ClientConfig

Configuration for PubMed and PMC clients.

Examples

>>> config = ClientConfig().with_api_key("your_api_key").with_email("you@example.com")
>>> client = Client.with_config(config)

with_api_key(api_key)¶: Set the NCBI API key for increased rate limits (10 req/sec instead of 3)

with_cache()¶: Enable default response caching

with_email(email)¶: Set the email address for identification (recommended by NCBI)

with_rate_limit(rate_limit)¶: Set custom rate limit in requests per second

with_timeout_seconds(timeout_seconds)¶: Set HTTP request timeout in seconds

with_tool(tool)¶: Set the tool name for identification (default: “pubmed-client-py”)
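
The rate limits mentioned above translate into a minimum spacing between requests. The following is a minimal sketch of that arithmetic only; `min_interval_seconds` is a hypothetical helper and not part of pubmed_client, and the library's own throttling may work differently.

```python
def min_interval_seconds(requests_per_second: float) -> float:
    """Return the minimum delay between requests for a given rate limit."""
    if requests_per_second <= 0:
        raise ValueError("rate limit must be positive")
    return 1.0 / requests_per_second


# 3 req/sec without an API key, 10 req/sec with one (figures from with_api_key above)
without_key = min_interval_seconds(3)
with_key = min_interval_seconds(10)
```

A value passed to with_rate_limit() would presumably map to a spacing in the same way.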

class pubmed_client.Affiliation¶

Bases: object

Python wrapper for Author affiliation

address¶

country¶

department¶

email¶

institution¶

class pubmed_client.Author¶

Bases: object

Python wrapper for Author

affiliations()¶: Get list of affiliations

email¶

full_name¶

given_names¶

initials¶

is_corresponding¶

orcid¶

roles()¶: Get list of roles/contributions

suffix¶

surname¶

class pubmed_client.PubMedArticle¶

Bases: object

Python wrapper for PubMedArticle

abstract_text¶

article_types()¶: Get article types

author_count¶

authors()¶: Get list of authors

doi¶

issn¶

issue¶

journal¶

journal_abbreviation¶

keywords()¶: Get keywords (empty list when none are present)

language¶

pages¶

pmc_id¶

pmid¶

pub_date¶

title¶

volume¶

class pubmed_client.RelatedArticles¶

Bases: object

Python wrapper for RelatedArticles

link_type¶

related_pmids¶

source_pmids¶

class pubmed_client.PmcLinks¶

Bases: object

Python wrapper for PmcLinks

pmc_ids¶

source_pmids¶

class pubmed_client.Citations¶

Bases: object

Python wrapper for Citations

citing_pmids¶

source_pmids¶

class pubmed_client.DatabaseInfo¶

Bases: object

Python wrapper for DatabaseInfo

build¶

count¶

description¶

last_update¶

menu_name¶

name¶

class pubmed_client.CitationQuery(journal, year, volume, first_page, author_name, key)¶

Bases: object

Input for a single citation match query

Used with the ECitMatch API to find PMIDs from citation information (journal, year, volume, page, author).

Examples

>>> query = CitationQuery(
...     journal="proc natl acad sci u s a",
...     year="1991",
...     volume="88",
...     first_page="3248",
...     author_name="mann bj",
...     key="Art1",
... )

author_name¶

first_page¶

journal¶

key¶

volume¶

year¶

class pubmed_client.CitationMatch¶

Bases: object

Result of a single citation match from the ECitMatch API

journal¶: Journal title from the query

year¶: Year from the query

volume¶: Volume from the query

first_page¶: First page from the query

author_name¶: Author name from the query

key¶: User-defined key from the query

pmid¶: Matched PMID (None if not found)

status¶: Match status (“found”, “not_found”, or “ambiguous”)

class pubmed_client.CitationMatches¶

Bases: object

Results from ECitMatch API for batch citation matching

matches¶: List of CitationMatch results

found_count()¶: Get the number of successful matches

class pubmed_client.DatabaseCount¶

Bases: object

Record count for a single NCBI database from the EGQuery API

db_name¶: Internal database name (e.g., “pubmed”, “pmc”)

menu_name¶: Human-readable database name (e.g., “PubMed”, “PMC”)

count¶: Number of matching records

status¶: Query status (e.g., “Ok”)

class pubmed_client.GlobalQueryResults¶

Bases: object

Results from EGQuery API for global database search

term¶: The query term that was searched

results¶: List of DatabaseCount results for each database

count_for(db_name)¶: Get count for a specific database

non_zero()¶: Get results with count > 0

class pubmed_client.EPostResult¶

Bases: object

Python wrapper for EPostResult

Result from EPost API for uploading PMIDs to the NCBI History server. Contains WebEnv and query_key identifiers for use with subsequent API calls.

webenv¶: WebEnv session identifier

query_key¶: Query key for the uploaded IDs within the session

Examples

>>> client = PubMedClient()
>>> result = client.epost(["31978945", "33515491"])
>>> print(f"WebEnv: {result.webenv}, Query Key: {result.query_key}")

class pubmed_client.SpellCheckResult¶

Bases: object

Python wrapper for SpellCheckResult

database¶: The database that was queried

query¶: The original query string

corrected_query¶: The corrected/suggested query

has_corrections¶: Whether any spelling corrections were made

replacements¶: List of corrected terms

Examples

>>> client = PubMedClient()
>>> result = client.spell_check("asthmaa")
>>> print(result.corrected_query)
asthma
>>> result.has_corrections
True

class pubmed_client.ArticleSummary¶

Bases: object

Lightweight article summary from the ESummary API

Contains basic metadata (title, authors, journal, dates) without abstracts, MeSH terms, or chemical lists. Faster than PubMedArticle for bulk retrieval.

Examples

>>> client = PubMedClient()
>>> summaries = client.fetch_summaries(["31978945", "33515491"])
>>> for s in summaries:
...     print(f"{s.pmid}: {s.title} ({s.pub_date})")

authors¶

doi¶

epub_date¶

essn¶

full_journal_name¶

issn¶

issue¶

journal¶

languages¶

pages¶

pmc_id¶

pmc_ref_count¶

pmid¶

pub_date¶

pub_types¶

record_status¶

sort_pub_date¶

title¶

volume¶

class pubmed_client.PmcAffiliation¶

Bases: object

Python wrapper for PMC Affiliation

address¶

country¶

department¶

id¶

institution¶

class pubmed_client.PmcAuthor¶

Bases: object

Python wrapper for PMC Author

affiliations()¶: Get list of affiliations

email¶

full_name¶

given_names¶

is_corresponding¶

orcid¶

roles()¶: Get list of roles/contributions

surname¶

class pubmed_client.Figure¶

Bases: object

Python wrapper for Figure

alt_text¶

caption¶

fig_type¶

graphic_href¶

id¶

label¶

class pubmed_client.ExtractedFigure¶

Bases: object

Python wrapper for ExtractedFigure

Represents a figure that has been extracted from a PMC tar.gz archive, combining XML metadata with actual file information.

dimensions¶: Image dimensions as (width, height) tuple if available

extracted_file_path¶: Actual file path where the figure was extracted

figure¶: Figure metadata from XML (caption, label, etc.)

file_size¶: File size in bytes

class pubmed_client.Table¶

Bases: object

Python wrapper for Table

caption¶

id¶

label¶

class pubmed_client.Reference¶

Bases: object

Python wrapper for Reference

doi¶

id¶

pmid¶

source¶

title¶

year¶

class pubmed_client.ArticleSection¶

Bases: object

Python wrapper for ArticleSection

content¶

section_type¶

title¶

class pubmed_client.PmcFullText¶

Bases: object

Python wrapper for PmcFullText

authors()¶: Get list of authors

doi¶

figures()¶: Get list of all figures from all sections

pmcid¶

pmid¶

references()¶: Get list of references

sections()¶: Get list of sections

tables()¶: Get list of all tables from all sections

title¶

to_markdown()¶

Convert the article to Markdown format

Returns:: A Markdown-formatted string representation of the article

Example

>>> full_text = client.pmc.fetch_full_text("PMC7906746")
>>> markdown = full_text.to_markdown()
>>> print(markdown)

class pubmed_client.OaSubsetInfo¶

Bases: object

Python wrapper for OaSubsetInfo

Information about OA (Open Access) subset availability for a PMC article. The OA subset contains articles with programmatic access to full-text XML.

citation¶: Citation string (if available)

download_format¶: Format of the download (e.g., “tgz”, “pdf”)

download_link¶: Download link for tar.gz package (if available)

error_code¶: Error code if not in OA subset

error_message¶: Error message if not in OA subset

is_oa_subset¶: Whether the article is in the OA subset

license¶: License type (if available)

pmcid¶: PMC ID (e.g., “PMC7906746”)

retracted¶: Whether the article is retracted

updated¶: Last updated timestamp for the download

class pubmed_client.PubMedClient¶

Bases: object

PubMed client for searching and fetching article metadata

Examples

>>> client = PubMedClient()
>>> articles = client.search_and_fetch("covid-19", 10)
>>> article = client.fetch_article("31978945")

epost(pmids: list[str]) → EPostResult¶

Upload a list of PMIDs to the NCBI History server using EPost

Stores UIDs on the server and returns WebEnv/query_key identifiers that can be used with subsequent API calls for batch fetching.

Parameters:: pmids – List of PubMed IDs as strings
Returns:: EPostResult containing webenv and query_key

Examples

>>> client = PubMedClient()
>>> result = client.epost(["31978945", "33515491", "25760099"])
>>> print(f"WebEnv: {result.webenv}")
>>> print(f"Query Key: {result.query_key}")

fetch_all_by_pmids(pmids: list[str]) → list[PubMedArticle]¶

Fetch all articles for a list of PMIDs using EPost and the History server

Uploads the PMID list via EPost (HTTP POST), then fetches articles in paginated batches. Recommended for large PMID lists (hundreds or thousands).

Parameters:: pmids – List of PubMed IDs as strings
Returns:: List of PubMedArticle objects

Examples

>>> client = PubMedClient()
>>> articles = client.fetch_all_by_pmids(["31978945", "33515491", "25760099"])
>>> for a in articles:
...     print(a.title)

fetch_article(pmid)¶

Fetch a single article by PMID

Parameters:: pmid – PubMed ID as a string
Returns:: PubMedArticle object

fetch_articles(pmids: list[str]) → list[PubMedArticle]¶

Fetch multiple articles by PMIDs in a single batch request

This is significantly more efficient than fetching articles one by one, as it sends fewer HTTP requests to the NCBI API. For large numbers of PMIDs, the request is automatically split into batches of 200.

Parameters:: pmids – List of PubMed IDs as strings
Returns:: List of PubMedArticle objects

Examples

>>> client = PubMedClient()
>>> articles = client.fetch_articles(["31978945", "33515491", "25760099"])
>>> for article in articles:
...     print(f"{article.pmid}: {article.title}")
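
The automatic batching described above can be pictured with a short standalone sketch. `chunk_pmids` is a hypothetical helper written for illustration; it only mirrors the documented batch size of 200 and is not the library's internal code.

```python
def chunk_pmids(pmids: list[str], batch_size: int = 200) -> list[list[str]]:
    """Split a PMID list into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [pmids[i:i + batch_size] for i in range(0, len(pmids), batch_size)]


# 450 placeholder IDs; these are not real article PMIDs
pmids = [str(n) for n in range(1, 451)]
batches = chunk_pmids(pmids)
batch_sizes = [len(batch) for batch in batches]
```

With 450 IDs, the sketch produces two full batches and one partial batch.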

fetch_summaries(pmids: list[str]) → list[ArticleSummary]¶

Fetch lightweight article summaries by PMIDs using the ESummary API

Returns basic metadata (title, authors, journal, dates, DOI) without abstracts, MeSH terms, or chemical lists. Faster than fetch_articles() for bulk metadata retrieval.

Parameters:: pmids – List of PubMed IDs as strings
Returns:: List of ArticleSummary objects

Examples

>>> client = PubMedClient()
>>> summaries = client.fetch_summaries(["31978945", "33515491"])
>>> for s in summaries:
...     print(f"{s.pmid}: {s.title}")

get_citations(pmids)¶

Get citing articles for given PMIDs

Returns articles that cite the specified PMIDs from the PubMed database only.

Important: Citation counts from this method may be LOWER than Google Scholar or scite.ai because this only includes peer-reviewed articles in PubMed. Other sources include preprints, books, and conference proceedings.

Example: PMID 31978945 shows ~14,000 citations in PubMed vs ~23,000 in scite.ai. This is expected - this method provides PubMed-specific citation data.

Parameters:: pmids – List of PubMed IDs
Returns:: Citations object containing citing article PMIDs

get_database_info(database)¶

Get detailed information about a specific database

Parameters:: database – Database name (e.g., “pubmed”, “pmc”)
Returns:: DatabaseInfo object

get_database_list()¶

Get list of all available NCBI databases

Returns:: List of database names

get_pmc_links(pmids)¶

Get PMC links for given PMIDs (full-text availability)

Parameters:: pmids – List of PubMed IDs
Returns:: PmcLinks object containing available PMC IDs

get_related_articles(pmids)¶

Get related articles for given PMIDs

Parameters:: pmids – List of PubMed IDs
Returns:: RelatedArticles object

global_query(term)¶

Query all NCBI databases for record counts using the EGQuery API

Returns the number of records matching the query in each Entrez database. Useful for exploratory searches.

Parameters:: term – Search query string
Returns:: GlobalQueryResults object containing counts per database

Examples

>>> client = PubMedClient()
>>> results = client.global_query("asthma")
>>> for db in results.non_zero():
...     print(f"{db.menu_name}: {db.count}")

match_citations(citations)¶

Match citations to PMIDs using the ECitMatch API

Takes citation information (journal, year, volume, page, author) and returns corresponding PMIDs. Useful for identifying PMIDs from reference lists.

Parameters:: citations – List of CitationQuery objects
Returns:: CitationMatches object containing match results

Examples

>>> client = PubMedClient()
>>> queries = [
...     CitationQuery("proc natl acad sci u s a", "1991", "88", "3248", "mann bj", "Art1"),
...     CitationQuery("science", "1987", "235", "182", "palmenberg ac", "Art2"),
... ]
>>> results = client.match_citations(queries)
>>> for m in results.matches:
...     print(f"{m.key}: {m.pmid} ({m.status})")
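
For orientation, the ECitMatch service itself takes each citation as a pipe-delimited string (`journal|year|volume|first_page|author_name|key|`), with multiple citations separated by a carriage return. The sketch below builds that string from plain tuples; `to_bdata` is a hypothetical helper, and how pubmed_client encodes CitationQuery objects internally is an assumption here.

```python
def to_bdata(citations: list[tuple[str, str, str, str, str, str]]) -> str:
    """Join (journal, year, volume, first_page, author_name, key) tuples
    into an ECitMatch-style citation string."""
    return "\r".join("|".join(fields) + "|" for fields in citations)


bdata = to_bdata([
    ("proc natl acad sci u s a", "1991", "88", "3248", "mann bj", "Art1"),
    ("science", "1987", "235", "182", "palmenberg ac", "Art2"),
])
first_line = bdata.split("\r")[0]
```

The trailing key field is what lets each CitationMatch be tied back to its originating query.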

search_and_fetch(query: str | SearchQuery, limit: int) → list[PubMedArticle]¶

Search for articles and fetch their metadata

Parameters:

query – Search query (either a string or SearchQuery object)
limit – Maximum number of articles to return (ignored if query is SearchQuery)

Returns:

List of PubMedArticle objects

Examples

>>> client = PubMedClient()
>>> # Using string query
>>> articles = client.search_and_fetch("covid-19", 10)
>>> # Using SearchQuery object
>>> query = SearchQuery().query("cancer").published_after(2020).limit(50)
>>> articles = client.search_and_fetch(query, 0)  # limit parameter ignored

search_and_fetch_summaries(query: str | SearchQuery, limit: int) → list[ArticleSummary]¶

Search and fetch lightweight summaries in a single operation

Combines search and ESummary fetch. Use this when you only need basic metadata and want faster retrieval than search_and_fetch().

Parameters:

query – Search query (either a string or SearchQuery object)
limit – Maximum number of articles (ignored if query is SearchQuery)

Returns:

List of ArticleSummary objects

Examples

>>> client = PubMedClient()
>>> summaries = client.search_and_fetch_summaries("covid-19", 20)
>>> for s in summaries:
...     print(f"{s.pmid}: {s.title}")

search_articles(query: str | SearchQuery, limit: int) → list[str]¶

Search for articles and return PMIDs only

This method returns only the list of PMIDs matching the query, which is faster than fetching full article metadata.

Parameters:

query – Search query (either a string or SearchQuery object)
limit – Maximum number of PMIDs to return (ignored if query is SearchQuery)

Returns:

List of PMIDs as strings

Examples

>>> client = PubMedClient()
>>> # Using string query
>>> pmids = client.search_articles("covid-19", 100)
>>> pmids = client.search_articles("cancer[ti] AND therapy[tiab]", 50)
>>> # Using SearchQuery object
>>> query = SearchQuery().query("covid-19").limit(100)
>>> pmids = client.search_articles(query, 0)  # limit parameter ignored

spell_check(term)¶

Check spelling of a search term using the ESpell API

Provides spelling suggestions for terms within a single text query. Uses the PubMed database by default.

Parameters:: term – The search term to spell-check
Returns:: SpellCheckResult with the corrected query and details

Examples

>>> client = PubMedClient()
>>> result = client.spell_check("asthmaa")
>>> print(result.corrected_query)
asthma

spell_check_db(term, db)¶

Check spelling of a search term against a specific database

Spelling suggestions are database-specific, so use the same database you plan to search.

Parameters:

term – The search term to spell-check
db – The NCBI database to check against (e.g., “pubmed”, “pmc”)

Returns:

SpellCheckResult with the corrected query and details

Examples

>>> client = PubMedClient()
>>> result = client.spell_check_db("fiberblast", "pmc")
>>> print(result.corrected_query)
fibroblast

static with_config(config)¶: Create a new PubMed client with custom configuration

class pubmed_client.PmcClient¶

Bases: object

PMC client for fetching full-text articles

Examples

>>> client = PmcClient()
>>> full_text = client.fetch_full_text("PMC7906746")
>>> pmcid = client.check_pmc_availability("31978945")

check_pmc_availability(pmid)¶

Check if a PubMed article has PMC full text available

Parameters:: pmid – PubMed ID as a string
Returns:: PMC ID if available, None otherwise

download_and_extract_tar(pmcid, output_dir)¶

Download and extract PMC tar.gz archive

Downloads the tar.gz file for the specified PMC ID and extracts all files to the output directory.

Parameters:

pmcid – PMC ID (e.g., “PMC7906746” or “7906746”)
output_dir – Directory path where files should be extracted

Returns:

List of extracted file paths

Note

This method is only available on non-WASM platforms

Example

>>> client = PmcClient()
>>> files = client.download_and_extract_tar("PMC7906746", "./output")
>>> for file in files:
...     print(file)

extract_figures_with_captions(pmcid, output_dir)¶

Extract figures with captions from PMC article

Downloads the tar.gz file for the specified PMC ID, extracts all files, and matches figures with their captions from the XML metadata.

Parameters:

pmcid – PMC ID (e.g., “PMC7906746” or “7906746”)
output_dir – Directory path where files should be extracted

Returns:

List of ExtractedFigure objects containing metadata and file information

Note

This method is only available on non-WASM platforms

Example

>>> client = PmcClient()
>>> figures = client.extract_figures_with_captions("PMC7906746", "./output")
>>> for fig in figures:
...     print(f"{fig.figure.id}: {fig.extracted_file_path}")
...     print(f"  Caption: {fig.figure.caption}")
...     print(f"  Size: {fig.file_size} bytes")
...     print(f"  Dimensions: {fig.dimensions}")

fetch_full_text(pmcid)¶

Fetch full text article from PMC

Parameters:: pmcid – PMC ID (e.g., “PMC7906746”)
Returns:: PmcFullText object containing structured article content

is_oa_subset(pmcid)¶

Check if a PMC article is in the OA (Open Access) subset

The OA subset contains articles with programmatic access to full-text XML. Some publishers restrict programmatic access even though the article may be viewable on the PMC website.

Parameters:: pmcid – PMC ID (with or without “PMC” prefix, e.g., “PMC7906746” or “7906746”)
Returns:: OaSubsetInfo object containing detailed information about OA availability

Example

>>> client = PmcClient()
>>> oa_info = client.is_oa_subset("PMC7906746")
>>> if oa_info.is_oa_subset:
...     print("Article is in OA subset")
...     if oa_info.download_link:
...         print(f"Download: {oa_info.download_link}")
... else:
...     print(f"Not in OA subset: {oa_info.error_message}")
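
Several PMC methods accept the ID with or without the "PMC" prefix. If your own code needs a single canonical form (for example, as a dictionary key), a normalization step along these lines may help. `normalize_pmcid` is a hypothetical helper and does not reproduce the library's parsing.

```python
def normalize_pmcid(pmcid: str) -> str:
    """Return the ID in 'PMC<digits>' form, accepting input with or without the prefix."""
    value = pmcid.strip()
    digits = value[3:] if value.upper().startswith("PMC") else value
    if not digits.isdigit():
        raise ValueError(f"not a PMC ID: {pmcid!r}")
    return f"PMC{digits}"


with_prefix = normalize_pmcid("PMC7906746")
without_prefix = normalize_pmcid("7906746")
```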

static with_config(config)¶: Create a new PMC client with custom configuration

class pubmed_client.Client¶

Bases: object

Combined client with both PubMed and PMC functionality

This is the main client you’ll typically use. It provides access to both PubMed metadata searches and PMC full-text retrieval.

Examples

>>> client = Client()
>>> # Access PubMed client
>>> articles = client.pubmed.search_and_fetch("covid-19", 10)
>>> # Access PMC client
>>> full_text = client.pmc.fetch_full_text("PMC7906746")
>>> # Search with full text
>>> results = client.search_with_full_text("covid-19", 5)

get_citations(pmids)¶: Get citing articles for given PMIDs

get_database_info(database)¶: Get detailed information about a specific database

get_database_list()¶: Get list of all available NCBI databases

get_pmc_links(pmids)¶: Get PMC links for given PMIDs

get_related_articles(pmids)¶: Get related articles for given PMIDs

pmc¶: Get PMC client for full-text operations

pubmed¶: Get PubMed client for metadata operations

search_with_full_text(query, limit)¶

Search for articles and attempt to fetch full text for each

This is a convenience method that searches PubMed and attempts to fetch PMC full text for each result when available.

Parameters:

query – Search query string
limit – Maximum number of articles to process

Returns:

List of tuples (PubMedArticle, Optional[PmcFullText])
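
Because the second element of each tuple can be None, callers usually need to branch on it. The sketch below separates results by full-text availability; `split_by_full_text` is a hypothetical helper, and plain strings stand in for PubMedArticle and PmcFullText objects so the example runs without network access.

```python
from typing import Any, Optional


def split_by_full_text(
    results: list[tuple[Any, Optional[Any]]],
) -> tuple[list[Any], list[Any]]:
    """Partition (article, full_text) pairs into articles with and without full text."""
    with_text, without_text = [], []
    for article, full_text in results:
        if full_text is not None:
            with_text.append(article)
        else:
            without_text.append(article)
    return with_text, without_text


# Stand-in data shaped like the documented return value
sample = [("article A", "full text A"), ("article B", None)]
have_text, missing_text = split_by_full_text(sample)
```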

spell_check(term)¶

Check spelling of a search term using the ESpell API

Provides spelling suggestions for terms within a single text query. Uses the PubMed database by default.

Parameters:: term – The search term to spell-check
Returns:: SpellCheckResult with the corrected query and details

Examples

>>> client = Client()
>>> result = client.spell_check("asthmaa")
>>> print(result.corrected_query)

static with_config(config)¶: Create a new combined client with custom configuration

class pubmed_client.SearchQuery¶

Bases: object

Python wrapper for SearchQuery

Builder for constructing PubMed search queries programmatically.

Examples

>>> query = SearchQuery().query("covid-19").limit(10)
>>> query_string = query.build()
>>> print(query_string)
covid-19

and_(other)¶

Combine this query with another using AND logic

Combines two queries by wrapping each in parentheses and joining with AND. If either query is empty, returns the non-empty query. The result uses the higher limit of the two queries.

Parameters:: other – Another SearchQuery to combine with
Returns:: New query with combined logic
Return type:: SearchQuery

Example

>>> q1 = SearchQuery().query("covid-19")
>>> q2 = SearchQuery().query("vaccine")
>>> combined = q1.and_(q2)
>>> combined.build()
'(covid-19) AND (vaccine)'

>>> # Complex chaining
>>> result = SearchQuery().query("cancer") \
...     .and_(SearchQuery().query("treatment")) \
...     .and_(SearchQuery().query("2024[pdat]"))
>>> result.build()
'((cancer) AND (treatment)) AND (2024[pdat])'
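
The combination rules described for and_() (parenthesize both sides, pass through the non-empty side, keep the higher limit) can be modeled in a few lines. `QueryModel` and `combine` are hypothetical names used only for this sketch; the handling of limits when one side is empty is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueryModel:
    """Toy stand-in for a built query string plus its optional limit."""
    text: str = ""
    limit: Optional[int] = None


def combine(left: QueryModel, right: QueryModel, operator: str) -> QueryModel:
    """Join two queries with a boolean operator, wrapping each side in parentheses."""
    if not left.text:
        return right  # assumption: the non-empty side is returned unchanged
    if not right.text:
        return left
    limits = [value for value in (left.limit, right.limit) if value is not None]
    return QueryModel(
        text=f"({left.text}) {operator} ({right.text})",
        limit=max(limits) if limits else None,
    )


pair = combine(QueryModel("covid-19", 10), QueryModel("vaccine", 50), "AND")
chained = combine(
    combine(QueryModel("cancer"), QueryModel("treatment"), "AND"),
    QueryModel("2024[pdat]"),
    "AND",
)
```

The nested parentheses in the chained result match the doctest output above, which is why long chains accumulate brackets.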

article_type(type_name)¶

Filter by a single article type

Parameters:: type_name – Article type name (case-insensitive) Supported types: “Clinical Trial”, “Review”, “Systematic Review”, “Meta-Analysis”, “Case Reports”, “Randomized Controlled Trial” (or “RCT”), “Observational Study”
Returns:: Self for method chaining
Return type:: SearchQuery
Raises:: ValueError – If article type is not recognized

Example

>>> query = SearchQuery().query("cancer").article_type("Clinical Trial")
>>> query.build()
'cancer AND Clinical Trial[pt]'

article_types(types)¶

Filter by multiple article types (OR logic)

When multiple types are provided, they are combined with OR logic. Empty list is silently ignored (no filter added).

Parameters:: types – List of article type names (case-insensitive)
Returns:: Self for method chaining
Return type:: SearchQuery
Raises:: ValueError – If any article type is not recognized

Example

>>> query = SearchQuery().query("treatment").article_types(["RCT", "Meta-Analysis"])
>>> query.build()
'treatment AND (Randomized Controlled Trial[pt] OR Meta-Analysis[pt])'
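
To make the OR logic concrete, here is a standalone sketch of how such a filter clause could be assembled. `article_types_clause` and the lookup table are hypothetical; the supported names come from article_type() above, while the parenthesized output for a single-element list is an assumption.

```python
CANONICAL_TYPES = [
    "Clinical Trial", "Review", "Systematic Review", "Meta-Analysis",
    "Case Reports", "Randomized Controlled Trial", "Observational Study",
]
# Case-insensitive lookup, plus the documented "RCT" alias
LOOKUP = {name.lower(): name for name in CANONICAL_TYPES}
LOOKUP["rct"] = "Randomized Controlled Trial"


def article_types_clause(types: list[str]) -> str:
    """Build a '[pt]' filter clause; an empty list yields no clause."""
    names = []
    for raw in types:
        key = raw.strip().lower()
        if key not in LOOKUP:
            raise ValueError(f"unrecognized article type: {raw!r}")
        names.append(LOOKUP[key])
    if not names:
        return ""
    return "(" + " OR ".join(f"{name}[pt]" for name in names) + ")"


clause = article_types_clause(["RCT", "Meta-Analysis"])
```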

build()¶

Build the final PubMed query string

Terms are joined with single spaces; PubMed combines untagged, space-separated terms with AND by default.

Returns:: Query string for PubMed E-utilities API
Return type:: str
Raises:: ValueError – If no search terms have been added

Example

>>> query = SearchQuery().query("covid-19").query("treatment")
>>> query.build()
'covid-19 treatment'

exclude(excluded)¶

Exclude articles matching the given query

Excludes results from this query that match the excluded query. This is the recommended way to filter out unwanted results. If either query is empty, returns the base query unchanged.

Parameters:: excluded – SearchQuery representing articles to exclude
Returns:: New query with exclusion logic
Return type:: SearchQuery

Example

>>> base = SearchQuery().query("cancer treatment")
>>> exclude = SearchQuery().query("animal studies")
>>> filtered = base.exclude(exclude)
>>> filtered.build()
'(cancer treatment) NOT (animal studies)'

>>> # Exclude multiple types of studies
>>> human_only = SearchQuery().query("therapy") \
...     .exclude(SearchQuery().query("animal studies")) \
...     .exclude(SearchQuery().query("in vitro"))

free_full_text_only()¶

Filter to articles with free full text (open access)

This includes articles that are freely available from PubMed Central and other open access sources.

Returns:: Self for method chaining
Return type:: SearchQuery

Example

>>> query = SearchQuery().query("cancer").free_full_text_only()
>>> query.build()
'cancer AND free full text[sb]'

full_text_only()¶

Filter to articles with full text links

This includes both free full text and subscription-based full text articles. Use free_full_text_only() if you only want open access articles.

Returns:: Self for method chaining
Return type:: SearchQuery

Example

>>> query = SearchQuery().query("diabetes").full_text_only()
>>> query.build()
'diabetes AND full text[sb]'

get_limit()¶

Get the limit for this query

Returns the configured limit or the default of 20 if not set.

Returns:: Maximum number of results (default: 20)
Return type:: int

Example

>>> query = SearchQuery().query("cancer").limit(100)
>>> query.get_limit()
100
>>> query2 = SearchQuery().query("diabetes")
>>> query2.get_limit()
20

group()¶

Add parentheses around the current query for grouping

Wraps the query in parentheses to control operator precedence in complex queries. Returns an empty query if the current query is empty.

Returns:: New query wrapped in parentheses
Return type:: SearchQuery

Example

>>> query = SearchQuery().query("cancer").or_(SearchQuery().query("tumor")).group()
>>> query.build()
'((cancer) OR (tumor))'

>>> # Controlling precedence
>>> q1 = SearchQuery().query("a").or_(SearchQuery().query("b")).group()
>>> q2 = SearchQuery().query("c").or_(SearchQuery().query("d")).group()
>>> result = q1.and_(q2)
>>> result.build()
'(((a) OR (b))) AND (((c) OR (d)))'

limit(limit=None)¶

Set the maximum number of results to return

Validates that limit is >0 and ≤10,000. None is treated as “use default” (20).

Parameters:: limit – Maximum number of results (None = use default of 20)
Returns:: Self for method chaining
Return type:: SearchQuery
Raises:: ValueError – If limit ≤ 0 or limit > 10,000

Example

>>> query = SearchQuery().query("cancer").limit(50)
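
The validation rule above is small enough to restate as code. This is a sketch of the documented behavior only; `resolve_limit` and the two constants are hypothetical names, not part of pubmed_client.

```python
from typing import Optional

DEFAULT_LIMIT = 20   # documented default
MAX_LIMIT = 10_000   # documented upper bound


def resolve_limit(limit: Optional[int] = None) -> int:
    """Return the effective limit, or raise ValueError if it is out of range."""
    if limit is None:
        return DEFAULT_LIMIT
    if limit <= 0 or limit > MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}, got {limit}")
    return limit


# Record whether an out-of-range value is rejected
try:
    resolve_limit(0)
    zero_rejected = False
except ValueError:
    zero_rejected = True
```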

negate()¶

Negate this query using NOT logic

Wraps the current query with NOT operator. This is typically used in combination with other queries to exclude results. Returns an empty query if the current query is empty.

Returns:: New query with NOT logic
Return type:: SearchQuery

Example

>>> query = SearchQuery().query("cancer").negate()
>>> query.build()
'NOT (cancer)'

>>> # More practical: exclude from search results
>>> base = SearchQuery().query("treatment")
>>> excluded = SearchQuery().query("animal studies").negate()
>>> # (Note: use exclude() method for this pattern)

or_(other)¶

Combine this query with another using OR logic

Combines two queries by wrapping each in parentheses and joining with OR. If either query is empty, returns the non-empty query. The result uses the higher limit of the two queries.

Parameters:: other – Another SearchQuery to combine with
Returns:: New query with combined logic
Return type:: SearchQuery

Example

>>> q1 = SearchQuery().query("diabetes")
>>> q2 = SearchQuery().query("hypertension")
>>> combined = q1.or_(q2)
>>> combined.build()
'(diabetes) OR (hypertension)'

>>> # Find articles about either condition
>>> result = SearchQuery().query("cancer") \
...     .or_(SearchQuery().query("tumor")) \
...     .or_(SearchQuery().query("oncology"))
>>> result.build()
'((cancer) OR (tumor)) OR (oncology)'

pmc_only()¶

Filter to articles with PMC full text

This filters to articles that have full text available in PubMed Central (PMC).

Returns:: Self for method chaining
Return type:: SearchQuery

Example

>>> query = SearchQuery().query("genomics").pmc_only()
>>> query.build()
'genomics AND pmc[sb]'

published_after(year)¶

Filter to articles published in or after a specific year

Equivalent to published_between(year, None), so the given year itself is included.

Parameters:: year – First year to include (must be 1800-3000)
Returns:: Self for method chaining
Return type:: SearchQuery
Raises:: ValueError – If year is outside the valid range (1800-3000)

Example

>>> query = SearchQuery().query("crispr").published_after(2020)
>>> query.build()
'crispr AND 2020:3000[pdat]'

published_before(year)¶

Filter to articles published in or before a specific year

Filters articles from 1900 up to and including the specified year.

Parameters:: year – Last year to include (must be 1800-3000)
Returns:: Self for method chaining
Return type:: SearchQuery
Raises:: ValueError – If year is outside the valid range (1800-3000)

Example

>>> query = SearchQuery().query("genome").published_before(2020)
>>> query.build()
'genome AND 1900:2020[pdat]'

published_between(start_year, end_year=None)¶

Filter by publication date range

Filters articles published between start_year and end_year (inclusive). If end_year is None, filters from start_year onwards (up to year 3000).

Parameters:

start_year – Start year (inclusive, must be 1800-3000)
end_year – End year (inclusive, optional, must be 1800-3000 if provided)

Returns:

Self for method chaining

Return type:

SearchQuery

Raises:

ValueError – If years are outside valid range or start_year > end_year

Example

>>> # Filter to 2020-2024
>>> query = SearchQuery().query("cancer").published_between(2020, 2024)
>>> query.build()
'cancer AND 2020:2024[pdat]'

>>> # Filter from 2020 onwards
>>> query = SearchQuery().query("treatment").published_between(2020, None)
>>> query.build()
'treatment AND 2020:3000[pdat]'
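
The date-range rules (inclusive bounds, an open end mapped to 3000, years restricted to 1800-3000) can be sketched as a small function. `pdat_clause` is a hypothetical helper that reproduces only the clause format shown in the examples above.

```python
from typing import Optional

MIN_YEAR, MAX_YEAR = 1800, 3000  # documented valid range


def pdat_clause(start_year: int, end_year: Optional[int] = None) -> str:
    """Build an inclusive publication-date clause such as '2020:2024[pdat]'."""
    end = MAX_YEAR if end_year is None else end_year
    for year in (start_year, end):
        if not MIN_YEAR <= year <= MAX_YEAR:
            raise ValueError(f"year {year} is outside {MIN_YEAR}-{MAX_YEAR}")
    if start_year > end:
        raise ValueError("start_year must not be greater than end_year")
    return f"{start_year}:{end}[pdat]"


bounded = pdat_clause(2020, 2024)
open_ended = pdat_clause(2020)
```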

published_in_year(year)¶

Filter to articles published in a specific year

Parameters:: year – Year to filter by (must be between 1800 and 3000)
Returns:: Self for method chaining
Return type:: SearchQuery
Raises:: ValueError – If year is outside the valid range (1800-3000)

Example

>>> query = SearchQuery().query("covid-19").published_in_year(2024)
>>> query.build()
'covid-19 AND 2024[pdat]'

query(term=None)¶

Add a search term to the query

Terms are accumulated (not replaced) and will be space-separated in the final query. None and empty strings (after trimming) are silently filtered out.

Parameters:: term – Search term string (None or empty strings are filtered)
Returns:: Self for method chaining
Return type:: SearchQuery

Example

>>> query = SearchQuery().query("covid-19").query("treatment")
>>> query.build()
'covid-19 treatment'

sort(sort_order)¶

Set the sort order for search results

Controls how PubMed orders the search results. The default (when not specified) is relevance-based sorting.

Parameters:: sort_order – Sort order string (case-insensitive) Supported values: “relevance”, “pub_date” (or “publication_date”), “author” (or “first_author”), “journal” (or “journal_name”)
Returns:: Self for method chaining
Return type:: SearchQuery
Raises:: ValueError – If sort order is not recognized

Example

>>> query = SearchQuery().query("cancer").sort("pub_date")
>>> # Results will be sorted by publication date (newest first)

terms(terms=None)¶

Add multiple search terms at once

Each term is processed like query(). None items and empty strings are filtered out.

Parameters:: terms – List of search term strings
Returns:: Self for method chaining
Return type:: SearchQuery

Example

>>> query = SearchQuery().terms(["covid-19", "vaccine", "efficacy"])
>>> query.build()
'covid-19 vaccine efficacy'
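
The filtering and joining behavior shared by query() and terms() can be summarized in a standalone sketch. `join_terms` is a hypothetical helper; whether the library also trims surrounding whitespace from terms it keeps is an assumption.

```python
from typing import Optional


def join_terms(terms: list[Optional[str]]) -> str:
    """Drop None and blank terms, then join the rest with single spaces."""
    cleaned = [term.strip() for term in terms if term is not None and term.strip()]
    if not cleaned:
        # build() is documented to raise when no search terms have been added
        raise ValueError("no search terms have been added")
    return " ".join(cleaned)


joined = join_terms(["covid-19", None, "   ", "vaccine", "efficacy"])
```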