API Reference¶
Complete API reference for the pubmed_client module.
Python bindings for PubMed and PMC API client
This module provides a high-performance Python interface to PubMed and PMC APIs for retrieving biomedical research articles.
- Main classes:
Client: Combined client for both PubMed and PMC
PubMedClient: Client for PubMed metadata
PmcClient: Client for PMC full-text articles
ClientConfig: Configuration for API clients
Examples
>>> import pubmed_client
>>> client = pubmed_client.Client()
>>> articles = client.pubmed.search_and_fetch("covid-19", 10)
>>> for article in articles:
...     print(article.title)
- class pubmed_client.ClientConfig¶
Bases: object
Python wrapper for ClientConfig
Configuration for PubMed and PMC clients.
Examples
>>> config = ClientConfig()
>>> config.with_api_key("your_api_key").with_email("you@example.com")
>>> client = Client.with_config(config)
- with_api_key(api_key)¶
Set the NCBI API key for increased rate limits (10 req/sec instead of 3)
- with_cache()¶
Enable default response caching
- with_email(email)¶
Set the email address for identification (recommended by NCBI)
- with_rate_limit(rate_limit)¶
Set custom rate limit in requests per second
- with_timeout_seconds(timeout_seconds)¶
Set HTTP request timeout in seconds
- with_tool(tool)¶
Set the tool name for identification (default: “pubmed-client-py”)
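The rate limits mentioned above translate into a minimum spacing between requests. The following is a minimal sketch of that arithmetic only, not the library's throttling code; `min_interval` and `schedule` are hypothetical helper names introduced for illustration.

```python
def min_interval(requests_per_second: float) -> float:
    """Smallest gap in seconds between two requests at the given rate."""
    if requests_per_second <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / requests_per_second


def schedule(n_requests: int, requests_per_second: float) -> list:
    """Earliest send time (seconds from start) for each request."""
    gap = min_interval(requests_per_second)
    return [i * gap for i in range(n_requests)]


# Gap without an API key (3 req/sec) and with one (10 req/sec).
print(round(min_interval(3), 3))
print(round(min_interval(10), 3))
```

A value passed to with_rate_limit() would presumably play the role of `requests_per_second` here, though how the client enforces it is not documented in this reference.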
- class pubmed_client.Affiliation¶
Bases: object
Python wrapper for Author affiliation
- address¶
- country¶
- department¶
- email¶
- institution¶
- class pubmed_client.Author¶
Bases: object
Python wrapper for Author
- affiliations()¶
Get list of affiliations
- email¶
- full_name¶
- given_names¶
- initials¶
- is_corresponding¶
- orcid¶
- roles()¶
Get list of roles/contributions
- suffix¶
- surname¶
- class pubmed_client.PubMedArticle¶
Bases: object
Python wrapper for PubMedArticle
- abstract_text¶
- article_types()¶
Get article types
- author_count¶
- authors()¶
Get list of authors
- doi¶
- issn¶
- issue¶
- journal¶
- journal_abbreviation¶
- keywords()¶
Get keywords
- language¶
- pages¶
- pmc_id¶
- pmid¶
- pub_date¶
- title¶
- volume¶
- class pubmed_client.RelatedArticles¶
Bases: object
Python wrapper for RelatedArticles
- link_type¶
- source_pmids¶
- class pubmed_client.Citations¶
Bases: object
Python wrapper for Citations
- citing_pmids¶
- source_pmids¶
- class pubmed_client.DatabaseInfo¶
Bases: object
Python wrapper for DatabaseInfo
- build¶
- count¶
- description¶
- last_update¶
- name¶
- class pubmed_client.CitationQuery(journal, year, volume, first_page, author_name, key)¶
Bases: object
Input for a single citation match query
Used with the ECitMatch API to find PMIDs from citation information (journal, year, volume, page, author).
Examples
>>> query = CitationQuery(
...     journal="proc natl acad sci u s a",
...     year="1991",
...     volume="88",
...     first_page="3248",
...     author_name="mann bj",
...     key="Art1",
... )
- author_name¶
- first_page¶
- journal¶
- key¶
- volume¶
- year¶
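For orientation, the ECitMatch endpoint itself accepts each citation as a pipe-delimited string in the same field order as the constructor above. The sketch below illustrates that wire format only; `format_citation` and `format_batch` are hypothetical helpers, and whether the library builds its requests exactly this way is an assumption.

```python
def format_citation(journal, year, volume, first_page, author_name, key):
    """Join citation fields into one pipe-delimited ECitMatch entry."""
    fields = [journal, year, volume, first_page, author_name, key]
    # Each entry ends with a trailing pipe.
    return "|".join(fields) + "|"


def format_batch(entries):
    """Separate multiple entries with a carriage return (%0D once URL-encoded)."""
    return "\r".join(entries)


entry = format_citation(
    "proc natl acad sci u s a", "1991", "88", "3248", "mann bj", "Art1"
)
print(entry)
```

The `key` field is returned unchanged with each result, which is what lets a caller pair matches with the original citations.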
- class pubmed_client.CitationMatch¶
Bases: object
Result of a single citation match from the ECitMatch API
- journal¶
Journal title from the query
- year¶
Year from the query
- volume¶
Volume from the query
- first_page¶
First page from the query
- author_name¶
Author name from the query
- key¶
User-defined key from the query
- pmid¶
Matched PMID (None if not found)
- status¶
Match status (“found”, “not_found”, or “ambiguous”)
- class pubmed_client.CitationMatches¶
Bases: object
Results from ECitMatch API for batch citation matching
- matches¶
List of CitationMatch results
- found_count()¶
Get the number of successful matches
- matches¶
Get the list of citation match results
- class pubmed_client.DatabaseCount¶
Bases: object
Record count for a single NCBI database from the EGQuery API
- db_name¶
Internal database name (e.g., “pubmed”, “pmc”)
- menu_name¶
Human-readable database name (e.g., “PubMed”, “PMC”)
- count¶
Number of matching records
- status¶
Query status (e.g., “Ok”)
- class pubmed_client.GlobalQueryResults¶
Bases: object
Results from EGQuery API for global database search
- term¶
The query term that was searched
- results¶
List of DatabaseCount results for each database
- count_for(db_name)¶
Get count for a specific database
- non_zero()¶
Get results with count > 0
- results¶
Get the list of database count results
- class pubmed_client.EPostResult¶
Bases: object
Python wrapper for EPostResult
Result from EPost API for uploading PMIDs to the NCBI History server. Contains WebEnv and query_key identifiers for use with subsequent API calls.
- webenv¶
WebEnv session identifier
- query_key¶
Query key for the uploaded IDs within the session
Examples
>>> client = PubMedClient()
>>> result = client.epost(["31978945", "33515491"])
>>> print(f"WebEnv: {result.webenv}, Query Key: {result.query_key}")
- class pubmed_client.SpellCheckResult¶
Bases: object
Python wrapper for SpellCheckResult
- database¶
The database that was queried
- query¶
The original query string
- corrected_query¶
The corrected/suggested query
- has_corrections¶
Whether any spelling corrections were made
- replacements¶
List of corrected terms
Examples
>>> client = PubMedClient()
>>> result = client.spell_check("asthmaa")
>>> print(result.corrected_query)
asthma
>>> result.has_corrections
True
- class pubmed_client.ArticleSummary¶
Bases: object
Lightweight article summary from the ESummary API
Contains basic metadata (title, authors, journal, dates) without abstracts, MeSH terms, or chemical lists. Faster than PubMedArticle for bulk retrieval.
Examples
>>> client = PubMedClient()
>>> summaries = client.fetch_summaries(["31978945", "33515491"])
>>> for s in summaries:
...     print(f"{s.pmid}: {s.title} ({s.pub_date})")
- authors¶
- doi¶
- epub_date¶
- essn¶
- full_journal_name¶
- issn¶
- issue¶
- journal¶
- languages¶
- pages¶
- pmc_id¶
- pmc_ref_count¶
- pmid¶
- pub_date¶
- pub_types¶
- record_status¶
- sort_pub_date¶
- title¶
- volume¶
- class pubmed_client.PmcAffiliation¶
Bases: object
Python wrapper for PMC Affiliation
- address¶
- country¶
- department¶
- id¶
- institution¶
- class pubmed_client.PmcAuthor¶
Bases: object
Python wrapper for PMC Author
- affiliations()¶
Get list of affiliations
- email¶
- full_name¶
- given_names¶
- is_corresponding¶
- orcid¶
- roles()¶
Get list of roles/contributions
- surname¶
- class pubmed_client.Figure¶
Bases: object
Python wrapper for Figure
- alt_text¶
- caption¶
- fig_type¶
- graphic_href¶
- id¶
- label¶
- class pubmed_client.ExtractedFigure¶
Bases: object
Python wrapper for ExtractedFigure
Represents a figure that has been extracted from a PMC tar.gz archive, combining XML metadata with actual file information.
- dimensions¶
Image dimensions as (width, height) tuple if available
- extracted_file_path¶
Actual file path where the figure was extracted
- figure¶
Figure metadata from XML (caption, label, etc.)
- file_size¶
File size in bytes
- class pubmed_client.Reference¶
Bases: object
Python wrapper for Reference
- doi¶
- id¶
- pmid¶
- source¶
- title¶
- year¶
- class pubmed_client.ArticleSection¶
Bases: object
Python wrapper for ArticleSection
- content¶
- section_type¶
- title¶
- class pubmed_client.PmcFullText¶
Bases: object
Python wrapper for PmcFullText
- authors()¶
Get list of authors
- doi¶
- figures()¶
Get list of all figures from all sections
- pmcid¶
- pmid¶
- references()¶
Get list of references
- sections()¶
Get list of sections
- tables()¶
Get list of all tables from all sections
- title¶
- to_markdown()¶
Convert the article to Markdown format
- Returns:
A Markdown-formatted string representation of the article
Example
>>> full_text = client.pmc.fetch_full_text("PMC7906746")
>>> markdown = full_text.to_markdown()
>>> print(markdown)
- class pubmed_client.OaSubsetInfo¶
Bases: object
Python wrapper for OaSubsetInfo
Information about OA (Open Access) subset availability for a PMC article. The OA subset contains articles with programmatic access to full-text XML.
- citation¶
Citation string (if available)
- download_format¶
Format of the download (e.g., “tgz”, “pdf”)
- download_link¶
Download link for tar.gz package (if available)
- error_code¶
Error code if not in OA subset
- error_message¶
Error message if not in OA subset
- is_oa_subset¶
Whether the article is in the OA subset
- license¶
License type (if available)
- pmcid¶
PMC ID (e.g., “PMC7906746”)
- retracted¶
Whether the article is retracted
- updated¶
Last updated timestamp for the download
- class pubmed_client.PubMedClient¶
Bases: object
PubMed client for searching and fetching article metadata
Examples
>>> client = PubMedClient()
>>> articles = client.search_and_fetch("covid-19", 10)
>>> article = client.fetch_article("31978945")
- epost(pmids: list[str]) → EPostResult¶
Upload a list of PMIDs to the NCBI History server using EPost
Stores UIDs on the server and returns WebEnv/query_key identifiers that can be used with subsequent API calls for batch fetching.
- Parameters:
pmids – List of PubMed IDs as strings
- Returns:
EPostResult containing webenv and query_key
Examples
>>> client = PubMedClient()
>>> result = client.epost(["31978945", "33515491", "25760099"])
>>> print(f"WebEnv: {result.webenv}")
>>> print(f"Query Key: {result.query_key}")
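To make "subsequent API calls" concrete, the sketch below builds a raw E-utilities EFetch URL that reads IDs from the History server instead of listing them. It performs no network request. `efetch_history_url` is a hypothetical helper, the identifiers are placeholders, and the library's own request construction may differ.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_history_url(webenv, query_key, retstart=0, retmax=200):
    """Build an EFetch URL that pages through IDs stored via EPost."""
    params = {
        "db": "pubmed",
        "WebEnv": webenv,
        "query_key": query_key,
        "retstart": retstart,  # offset of the first record in this page
        "retmax": retmax,      # page size
        "retmode": "xml",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


# Placeholder identifiers; real values would come from an EPostResult.
url = efetch_history_url("EXAMPLE_WEBENV", "1")
print(url)
```

Advancing `retstart` by `retmax` on each call is the usual way to page through a stored ID set.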
- fetch_all_by_pmids(pmids: list[str]) → list[PubMedArticle]¶
Fetch all articles for a list of PMIDs using EPost and the History server
Uploads the PMID list via EPost (HTTP POST), then fetches articles in paginated batches. Recommended for large PMID lists (hundreds or thousands).
- Parameters:
pmids – List of PubMed IDs as strings
- Returns:
List of PubMedArticle objects
Examples
>>> client = PubMedClient()
>>> articles = client.fetch_all_by_pmids(["31978945", "33515491", "25760099"])
>>> for a in articles:
...     print(a.title)
- fetch_article(pmid)¶
Fetch a single article by PMID
- Parameters:
pmid – PubMed ID as a string
- Returns:
PubMedArticle object
- fetch_articles(pmids: list[str]) → list[PubMedArticle]¶
Fetch multiple articles by PMIDs in a single batch request
This is significantly more efficient than fetching articles one by one, as it sends fewer HTTP requests to the NCBI API. For large numbers of PMIDs, the request is automatically split into batches of 200.
- Parameters:
pmids – List of PubMed IDs as strings
- Returns:
List of PubMedArticle objects
Examples
>>> client = PubMedClient()
>>> articles = client.fetch_articles(["31978945", "33515491", "25760099"])
>>> for article in articles:
...     print(f"{article.pmid}: {article.title}")
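The automatic splitting into batches of 200 can be pictured with a short sketch. This illustrates the chunking idea only, under the assumption that batches are consecutive slices; `batches` is a hypothetical helper, not part of pubmed_client.

```python
def batches(pmids, size=200):
    """Yield consecutive slices of at most `size` PMIDs."""
    for start in range(0, len(pmids), size):
        yield pmids[start:start + size]


pmids = [str(n) for n in range(1, 451)]  # 450 made-up IDs
sizes = [len(batch) for batch in batches(pmids)]
print(sizes)  # [200, 200, 50]
```

With 450 IDs this amounts to three requests rather than 450 individual fetch_article() calls.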
- fetch_summaries(pmids: list[str]) → list[ArticleSummary]¶
Fetch lightweight article summaries by PMIDs using the ESummary API
Returns basic metadata (title, authors, journal, dates, DOI) without abstracts, MeSH terms, or chemical lists. Faster than fetch_articles() for bulk metadata retrieval.
- Parameters:
pmids – List of PubMed IDs as strings
- Returns:
List of ArticleSummary objects
Examples
>>> client = PubMedClient()
>>> summaries = client.fetch_summaries(["31978945", "33515491"])
>>> for s in summaries:
...     print(f"{s.pmid}: {s.title}")
- get_citations(pmids)¶
Get citing articles for given PMIDs
Returns articles that cite the specified PMIDs from the PubMed database only.
Important: Citation counts from this method may be LOWER than Google Scholar or scite.ai because this only includes peer-reviewed articles in PubMed. Other sources include preprints, books, and conference proceedings.
Example: PMID 31978945 shows ~14,000 citations in PubMed vs ~23,000 in scite.ai. This is expected - this method provides PubMed-specific citation data.
- Parameters:
pmids – List of PubMed IDs
- Returns:
Citations object containing citing article PMIDs
- get_database_info(database)¶
Get detailed information about a specific database
- Parameters:
database – Database name (e.g., “pubmed”, “pmc”)
- Returns:
DatabaseInfo object
- get_database_list()¶
Get list of all available NCBI databases
- Returns:
List of database names
- get_pmc_links(pmids)¶
Get PMC links for given PMIDs (full-text availability)
- Parameters:
pmids – List of PubMed IDs
- Returns:
PmcLinks object containing available PMC IDs
Get related articles for given PMIDs
- Parameters:
pmids – List of PubMed IDs
- Returns:
RelatedArticles object
- global_query(term)¶
Query all NCBI databases for record counts using the EGQuery API
Returns the number of records matching the query in each Entrez database. Useful for exploratory searches.
- Parameters:
term – Search query string
- Returns:
GlobalQueryResults object containing counts per database
Examples
>>> client = PubMedClient()
>>> results = client.global_query("asthma")
>>> for db in results.non_zero():
...     print(f"{db.menu_name}: {db.count}")
- match_citations(citations)¶
Match citations to PMIDs using the ECitMatch API
Takes citation information (journal, year, volume, page, author) and returns corresponding PMIDs. Useful for identifying PMIDs from reference lists.
- Parameters:
citations – List of CitationQuery objects
- Returns:
CitationMatches object containing match results
Examples
>>> client = PubMedClient()
>>> queries = [
...     CitationQuery("proc natl acad sci u s a", "1991", "88", "3248", "mann bj", "Art1"),
...     CitationQuery("science", "1987", "235", "182", "palmenberg ac", "Art2"),
... ]
>>> results = client.match_citations(queries)
>>> for m in results.matches:
...     print(f"{m.key}: {m.pmid} ({m.status})")
- search_and_fetch(query: str | SearchQuery, limit: int) → list[PubMedArticle]¶
Search for articles and fetch their metadata
- Parameters:
query – Search query (either a string or SearchQuery object)
limit – Maximum number of articles to return (ignored if query is SearchQuery)
- Returns:
List of PubMedArticle objects
Examples
>>> client = PubMedClient()
>>> # Using string query
>>> articles = client.search_and_fetch("covid-19", 10)
>>> # Using SearchQuery object
>>> query = SearchQuery().query("cancer").published_after(2020).limit(50)
>>> articles = client.search_and_fetch(query, 0)  # limit parameter ignored
- search_and_fetch_summaries(query: str | SearchQuery, limit: int) → list[ArticleSummary]¶
Search and fetch lightweight summaries in a single operation
Combines search and ESummary fetch. Use this when you only need basic metadata and want faster retrieval than search_and_fetch().
- Parameters:
query – Search query (either a string or SearchQuery object)
limit – Maximum number of articles (ignored if query is SearchQuery)
- Returns:
List of ArticleSummary objects
Examples
>>> client = PubMedClient()
>>> summaries = client.search_and_fetch_summaries("covid-19", 20)
>>> for s in summaries:
...     print(f"{s.pmid}: {s.title}")
- search_articles(query: str | SearchQuery, limit: int) → list[str]¶
Search for articles and return PMIDs only
This method returns only the list of PMIDs matching the query, which is faster than fetching full article metadata.
- Parameters:
query – Search query (either a string or SearchQuery object)
limit – Maximum number of PMIDs to return (ignored if query is SearchQuery)
- Returns:
List of PMIDs as strings
Examples
>>> client = PubMedClient()
>>> # Using string query
>>> pmids = client.search_articles("covid-19", 100)
>>> pmids = client.search_articles("cancer[ti] AND therapy[tiab]", 50)
>>> # Using SearchQuery object
>>> query = SearchQuery().query("covid-19").limit(100)
>>> pmids = client.search_articles(query, 0)  # limit parameter ignored
- spell_check(term)¶
Check spelling of a search term using the ESpell API
Provides spelling suggestions for terms within a single text query. Uses the PubMed database by default.
- Parameters:
term – The search term to spell-check
- Returns:
SpellCheckResult with the corrected query and details
Examples
>>> client = PubMedClient()
>>> result = client.spell_check("asthmaa")
>>> print(result.corrected_query)
asthma
- spell_check_db(term, db)¶
Check spelling of a search term against a specific database
Spelling suggestions are database-specific, so use the same database you plan to search.
- Parameters:
term – The search term to spell-check
db – The NCBI database to check against (e.g., “pubmed”, “pmc”)
- Returns:
SpellCheckResult with the corrected query and details
Examples
>>> client = PubMedClient()
>>> result = client.spell_check_db("fiberblast", "pmc")
>>> print(result.corrected_query)
fibroblast
- static with_config(config)¶
Create a new PubMed client with custom configuration
- class pubmed_client.PmcClient¶
Bases: object
PMC client for fetching full-text articles
Examples
>>> client = PmcClient()
>>> full_text = client.fetch_full_text("PMC7906746")
>>> pmcid = client.check_pmc_availability("31978945")
- check_pmc_availability(pmid)¶
Check if a PubMed article has PMC full text available
- Parameters:
pmid – PubMed ID as a string
- Returns:
PMC ID if available, None otherwise
- download_and_extract_tar(pmcid, output_dir)¶
Download and extract PMC tar.gz archive
Downloads the tar.gz file for the specified PMC ID and extracts all files to the output directory.
- Parameters:
pmcid – PMC ID (e.g., “PMC7906746” or “7906746”)
output_dir – Directory path where files should be extracted
- Returns:
List of extracted file paths
Note
This method is only available on non-WASM platforms
Example
>>> client = PmcClient()
>>> files = client.download_and_extract_tar("PMC7906746", "./output")
>>> for file in files:
...     print(file)
- extract_figures_with_captions(pmcid, output_dir)¶
Extract figures with captions from PMC article
Downloads the tar.gz file for the specified PMC ID, extracts all files, and matches figures with their captions from the XML metadata.
- Parameters:
pmcid – PMC ID (e.g., “PMC7906746” or “7906746”)
output_dir – Directory path where files should be extracted
- Returns:
List of ExtractedFigure objects containing metadata and file information
Note
This method is only available on non-WASM platforms
Example
>>> client = PmcClient()
>>> figures = client.extract_figures_with_captions("PMC7906746", "./output")
>>> for fig in figures:
...     print(f"{fig.figure.id}: {fig.extracted_file_path}")
...     print(f" Caption: {fig.figure.caption}")
...     print(f" Size: {fig.file_size} bytes")
...     print(f" Dimensions: {fig.dimensions}")
- fetch_full_text(pmcid)¶
Fetch full text article from PMC
- Parameters:
pmcid – PMC ID (e.g., “PMC7906746”)
- Returns:
PmcFullText object containing structured article content
- is_oa_subset(pmcid)¶
Check if a PMC article is in the OA (Open Access) subset
The OA subset contains articles with programmatic access to full-text XML. Some publishers restrict programmatic access even though the article may be viewable on the PMC website.
- Parameters:
pmcid – PMC ID (with or without “PMC” prefix, e.g., “PMC7906746” or “7906746”)
- Returns:
OaSubsetInfo object containing detailed information about OA availability
Example
>>> client = PmcClient()
>>> oa_info = client.is_oa_subset("PMC7906746")
>>> if oa_info.is_oa_subset:
...     print(f"Article is in OA subset")
...     if oa_info.download_link:
...         print(f"Download: {oa_info.download_link}")
... else:
...     print(f"Not in OA subset: {oa_info.error_message}")
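Several PmcClient methods accept the PMC ID with or without the “PMC” prefix. One way to normalize such input in calling code is sketched below; `normalize_pmcid` is a hypothetical helper, and how the library handles the prefix internally is an assumption.

```python
def normalize_pmcid(pmcid: str) -> str:
    """Return the ID in "PMC<digits>" form, accepting either spelling."""
    text = pmcid.strip()
    # Drop a leading "PMC" (any letter case) if present.
    digits = text[3:] if text.upper().startswith("PMC") else text
    if not digits.isdigit():
        raise ValueError(f"not a PMC ID: {pmcid!r}")
    return f"PMC{digits}"


print(normalize_pmcid("7906746"))     # PMC7906746
print(normalize_pmcid("PMC7906746"))  # PMC7906746
```

Normalizing up front keeps cache keys and log messages consistent regardless of how the ID was supplied.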
- static with_config(config)¶
Create a new PMC client with custom configuration
- class pubmed_client.Client¶
Bases: object
Combined client with both PubMed and PMC functionality
This is the main client you’ll typically use. It provides access to both PubMed metadata searches and PMC full-text retrieval.
Examples
>>> client = Client()
>>> # Access PubMed client
>>> articles = client.pubmed.search_and_fetch("covid-19", 10)
>>> # Access PMC client
>>> full_text = client.pmc.fetch_full_text("PMC7906746")
>>> # Search with full text
>>> results = client.search_with_full_text("covid-19", 5)
- get_citations(pmids)¶
Get citing articles for given PMIDs
- get_database_info(database)¶
Get detailed information about a specific database
- get_database_list()¶
Get list of all available NCBI databases
- get_pmc_links(pmids)¶
Get PMC links for given PMIDs
Get related articles for given PMIDs
- pmc¶
Get PMC client for full-text operations
- pubmed¶
Get PubMed client for metadata operations
- search_with_full_text(query, limit)¶
Search for articles and attempt to fetch full text for each
This is a convenience method that searches PubMed and attempts to fetch PMC full text for each result when available.
- Parameters:
query – Search query string
limit – Maximum number of articles to process
- Returns:
List of tuples (PubMedArticle, Optional[PmcFullText])
- spell_check(term)¶
Check spelling of a search term using the ESpell API
Provides spelling suggestions for terms within a single text query. Uses the PubMed database by default.
- Parameters:
term – The search term to spell-check
- Returns:
SpellCheckResult with the corrected query and details
Examples
>>> client = Client()
>>> result = client.spell_check("asthmaa")
>>> print(result.corrected_query)
- static with_config(config)¶
Create a new combined client with custom configuration
- class pubmed_client.SearchQuery¶
Bases: object
Python wrapper for SearchQuery
Builder for constructing PubMed search queries programmatically.
Examples
>>> query = SearchQuery().query("covid-19").limit(10)
>>> query_string = query.build()
>>> print(query_string)
covid-19
- and_(other)¶
Combine this query with another using AND logic
Combines two queries by wrapping each in parentheses and joining with AND. If either query is empty, returns the non-empty query. The result uses the higher limit of the two queries.
- Parameters:
other – Another SearchQuery to combine with
- Returns:
New query with combined logic
- Return type:
SearchQuery
Example
>>> q1 = SearchQuery().query("covid-19")
>>> q2 = SearchQuery().query("vaccine")
>>> combined = q1.and_(q2)
>>> combined.build()
'(covid-19) AND (vaccine)'
>>> # Complex chaining
>>> result = SearchQuery().query("cancer") \
...     .and_(SearchQuery().query("treatment")) \
...     .and_(SearchQuery().query("2024[pdat]"))
>>> result.build()
'((cancer) AND (treatment)) AND (2024[pdat])'
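The parenthesization rule described for and_() (and, with a different operator, or_()) can be reproduced on plain strings. This is a sketch of the documented behavior, not the library's implementation; `combine` is a hypothetical function and the limit handling is omitted.

```python
def combine(left: str, right: str, operator: str) -> str:
    """Join two query strings with a Boolean operator, parenthesizing each."""
    # If either side is empty, the other side is returned unchanged.
    if not left:
        return right
    if not right:
        return left
    return f"({left}) {operator} ({right})"


step = combine("cancer", "treatment", "AND")
print(combine(step, "2024[pdat]", "AND"))
# ((cancer) AND (treatment)) AND (2024[pdat])
```

Wrapping both operands on every call is what produces the nested parentheses in chained queries.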
- article_type(type_name)¶
Filter by a single article type
- Parameters:
type_name – Article type name (case-insensitive). Supported types: “Clinical Trial”, “Review”, “Systematic Review”, “Meta-Analysis”, “Case Reports”, “Randomized Controlled Trial” (or “RCT”), “Observational Study”
- Returns:
Self for method chaining
- Return type:
SearchQuery
- Raises:
ValueError – If article type is not recognized
Example
>>> query = SearchQuery().query("cancer").article_type("Clinical Trial")
>>> query.build()
'cancer AND Clinical Trial[pt]'
- article_types(types)¶
Filter by multiple article types (OR logic)
When multiple types are provided, they are combined with OR logic. Empty list is silently ignored (no filter added).
- Parameters:
types – List of article type names (case-insensitive)
- Returns:
Self for method chaining
- Return type:
SearchQuery
- Raises:
ValueError – If any article type is not recognized
Example
>>> query = SearchQuery().query("treatment").article_types(["RCT", "Meta-Analysis"])
>>> query.build()
'treatment AND (Randomized Controlled Trial[pt] OR Meta-Analysis[pt])'
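The OR-combination of article types can be sketched on plain strings. The supported names and the “RCT” alias come from the article_type() entry; `article_type_filter` and its lookup table are hypothetical, and the library's real validation may differ in detail.

```python
SUPPORTED = [
    "Clinical Trial", "Review", "Systematic Review", "Meta-Analysis",
    "Case Reports", "Randomized Controlled Trial", "Observational Study",
]
# Case-insensitive lookup table, plus the documented "RCT" alias.
CANONICAL = {name.lower(): name for name in SUPPORTED}
CANONICAL["rct"] = "Randomized Controlled Trial"


def article_type_filter(types):
    """Return a [pt] filter string, or "" when the list is empty."""
    tags = []
    for name in types:
        key = name.strip().lower()
        if key not in CANONICAL:
            raise ValueError(f"unrecognized article type: {name!r}")
        tags.append(f"{CANONICAL[key]}[pt]")
    if not tags:
        return ""
    if len(tags) == 1:
        return tags[0]
    return "(" + " OR ".join(tags) + ")"


print("treatment AND " + article_type_filter(["RCT", "Meta-Analysis"]))
```

The parentheses around the OR group keep the filter from binding to the preceding AND term.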
- build()¶
Build the final PubMed query string
Terms are joined with spaces. PubMed treats space-separated terms as combined with AND unless another operator is given.
- Returns:
Query string for PubMed E-utilities API
- Return type:
str
- Raises:
ValueError – If no search terms have been added
Example
>>> query = SearchQuery().query("covid-19").query("treatment")
>>> query.build()
'covid-19 treatment'
- exclude(excluded)¶
Exclude articles matching the given query
Excludes results from this query that match the excluded query. This is the recommended way to filter out unwanted results. If either query is empty, returns the base query unchanged.
- Parameters:
excluded – SearchQuery representing articles to exclude
- Returns:
New query with exclusion logic
- Return type:
SearchQuery
Example
>>> base = SearchQuery().query("cancer treatment")
>>> exclude = SearchQuery().query("animal studies")
>>> filtered = base.exclude(exclude)
>>> filtered.build()
'(cancer treatment) NOT (animal studies)'
>>> # Exclude multiple types of studies
>>> human_only = SearchQuery().query("therapy") \
...     .exclude(SearchQuery().query("animal studies")) \
...     .exclude(SearchQuery().query("in vitro"))
- free_full_text_only()¶
Filter to articles with free full text (open access)
This includes articles that are freely available from PubMed Central and other open access sources.
- Returns:
Self for method chaining
- Return type:
SearchQuery
Example
>>> query = SearchQuery().query("cancer").free_full_text_only()
>>> query.build()
'cancer AND free full text[sb]'
- full_text_only()¶
Filter to articles with full text links
This includes both free full text and subscription-based full text articles. Use free_full_text_only() if you only want open access articles.
- Returns:
Self for method chaining
- Return type:
SearchQuery
Example
>>> query = SearchQuery().query("diabetes").full_text_only()
>>> query.build()
'diabetes AND full text[sb]'
- get_limit()¶
Get the limit for this query
Returns the configured limit or the default of 20 if not set.
- Returns:
Maximum number of results (default: 20)
- Return type:
int
Example
>>> query = SearchQuery().query("cancer").limit(100)
>>> query.get_limit()
100
>>> query2 = SearchQuery().query("diabetes")
>>> query2.get_limit()
20
- group()¶
Add parentheses around the current query for grouping
Wraps the query in parentheses to control operator precedence in complex queries. Returns an empty query if the current query is empty.
- Returns:
New query wrapped in parentheses
- Return type:
SearchQuery
Example
>>> query = SearchQuery().query("cancer").or_(SearchQuery().query("tumor")).group()
>>> query.build()
'((cancer) OR (tumor))'
>>> # Controlling precedence
>>> q1 = SearchQuery().query("a").or_(SearchQuery().query("b")).group()
>>> q2 = SearchQuery().query("c").or_(SearchQuery().query("d")).group()
>>> result = q1.and_(q2)
>>> result.build()
'(((a) OR (b))) AND (((c) OR (d)))'
- limit(limit=None)¶
Set the maximum number of results to return
Validates that limit is >0 and ≤10,000. None is treated as “use default” (20).
- Parameters:
limit – Maximum number of results (None = use default of 20)
- Returns:
Self for method chaining
- Return type:
SearchQuery
- Raises:
ValueError – If limit ≤ 0 or limit > 10,000
Example
>>> query = SearchQuery().query("cancer").limit(50)
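The validation rule for limit() is small enough to state as code. This is a sketch of the documented bounds, not the library's source; `resolve_limit` and the two constants are hypothetical names.

```python
DEFAULT_LIMIT = 20   # used when no limit is given
MAX_LIMIT = 10_000   # documented upper bound


def resolve_limit(limit=None):
    """Return the effective limit, or raise ValueError if out of range."""
    if limit is None:
        return DEFAULT_LIMIT
    if limit <= 0 or limit > MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}, got {limit}")
    return limit


print(resolve_limit())    # 20
print(resolve_limit(50))  # 50
```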
- negate()¶
Negate this query using NOT logic
Wraps the current query with NOT operator. This is typically used in combination with other queries to exclude results. Returns an empty query if the current query is empty.
- Returns:
New query with NOT logic
- Return type:
SearchQuery
Example
>>> query = SearchQuery().query("cancer").negate()
>>> query.build()
'NOT (cancer)'
>>> # More practical: exclude from search results
>>> base = SearchQuery().query("treatment")
>>> excluded = SearchQuery().query("animal studies").negate()
>>> # (Note: use exclude() method for this pattern)
- or_(other)¶
Combine this query with another using OR logic
Combines two queries by wrapping each in parentheses and joining with OR. If either query is empty, returns the non-empty query. The result uses the higher limit of the two queries.
- Parameters:
other – Another SearchQuery to combine with
- Returns:
New query with combined logic
- Return type:
SearchQuery
Example
>>> q1 = SearchQuery().query("diabetes")
>>> q2 = SearchQuery().query("hypertension")
>>> combined = q1.or_(q2)
>>> combined.build()
'(diabetes) OR (hypertension)'
>>> # Find articles about either condition
>>> result = SearchQuery().query("cancer") \
...     .or_(SearchQuery().query("tumor")) \
...     .or_(SearchQuery().query("oncology"))
>>> result.build()
'((cancer) OR (tumor)) OR (oncology)'
- pmc_only()¶
Filter to articles with PMC full text
This filters to articles that have full text available in PubMed Central (PMC).
- Returns:
Self for method chaining
- Return type:
SearchQuery
Example
>>> query = SearchQuery().query("genomics").pmc_only()
>>> query.build()
'genomics AND pmc[sb]'
- published_after(year)¶
Filter to articles published in or after a specific year
Equivalent to published_between(year, None), so the given year itself is included.
- Parameters:
year – Year after which articles were published (must be 1800-3000)
- Returns:
Self for method chaining
- Return type:
SearchQuery
- Raises:
ValueError – If year is outside the valid range (1800-3000)
Example
>>> query = SearchQuery().query("crispr").published_after(2020)
>>> query.build()
'crispr AND 2020:3000[pdat]'
- published_before(year)¶
Filter to articles published in or before a specific year
Filters articles from 1900 up to and including the specified year.
- Parameters:
year – Year before which articles were published (must be 1800-3000)
- Returns:
Self for method chaining
- Return type:
SearchQuery
- Raises:
ValueError – If year is outside the valid range (1800-3000)
Example
>>> query = SearchQuery().query("genome").published_before(2020)
>>> query.build()
'genome AND 1900:2020[pdat]'
- published_between(start_year, end_year=None)¶
Filter by publication date range
Filters articles published between start_year and end_year (inclusive). If end_year is None, filters from start_year onwards (up to year 3000).
- Parameters:
start_year – Start year (inclusive, must be 1800-3000)
end_year – End year (inclusive, optional, must be 1800-3000 if provided)
- Returns:
Self for method chaining
- Return type:
SearchQuery
- Raises:
ValueError – If years are outside valid range or start_year > end_year
Example
>>> # Filter to 2020-2024
>>> query = SearchQuery().query("cancer").published_between(2020, 2024)
>>> query.build()
'cancer AND 2020:2024[pdat]'
>>> # Filter from 2020 onwards
>>> query = SearchQuery().query("treatment").published_between(2020, None)
>>> query.build()
'treatment AND 2020:3000[pdat]'
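The date-range rules above (inclusive bounds, an open end mapped to year 3000, and range validation) can be sketched as a standalone function. `date_filter` is a hypothetical name; the sketch mirrors the documented behavior rather than the library's code.

```python
MIN_YEAR, MAX_YEAR = 1800, 3000


def date_filter(start_year, end_year=None):
    """Build a [pdat] range filter; an open end defaults to MAX_YEAR."""
    end = MAX_YEAR if end_year is None else end_year
    for year in (start_year, end):
        if not MIN_YEAR <= year <= MAX_YEAR:
            raise ValueError(f"year {year} outside {MIN_YEAR}-{MAX_YEAR}")
    if start_year > end:
        raise ValueError("start_year must not exceed end_year")
    return f"{start_year}:{end}[pdat]"


print(date_filter(2020, 2024))  # 2020:2024[pdat]
print(date_filter(2020))        # 2020:3000[pdat]
```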
- published_in_year(year)¶
Filter to articles published in a specific year
- Parameters:
year – Year to filter by (must be between 1800 and 3000)
- Returns:
Self for method chaining
- Return type:
SearchQuery
- Raises:
ValueError – If year is outside the valid range (1800-3000)
Example
>>> query = SearchQuery().query("covid-19").published_in_year(2024)
>>> query.build()
'covid-19 AND 2024[pdat]'
- query(term=None)¶
Add a search term to the query
Terms are accumulated (not replaced) and will be space-separated in the final query. None and empty strings (after trimming) are silently filtered out.
- Parameters:
term – Search term string (None or empty strings are filtered)
- Returns:
Self for method chaining
- Return type:
SearchQuery
Example
>>> query = SearchQuery().query("covid-19").query("treatment")
>>> query.build()
'covid-19 treatment'
- sort(sort_order)¶
Set the sort order for search results
Controls how PubMed orders the search results. The default (when not specified) is relevance-based sorting.
- Parameters:
sort_order – Sort order string (case-insensitive). Supported values: “relevance”, “pub_date” (or “publication_date”), “author” (or “first_author”), “journal” (or “journal_name”)
- Returns:
Self for method chaining
- Return type:
SearchQuery
- Raises:
ValueError – If sort order is not recognized
Example
>>> query = SearchQuery().query("cancer").sort("pub_date")
>>> # Results will be sorted by publication date (newest first)
- terms(terms=None)¶
Add multiple search terms at once
Each term is processed like query(). None items and empty strings are filtered out.
- Parameters:
terms – List of search term strings
- Returns:
Self for method chaining
- Return type:
SearchQuery
Example
>>> query = SearchQuery().terms(["covid-19", "vaccine", "efficacy"])
>>> query.build()
'covid-19 vaccine efficacy'
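The filtering and joining behavior described for query() and terms() can be sketched on a plain list. `join_terms` is a hypothetical helper; trimming the surviving terms is an assumption, since the reference only states that blank entries are dropped.

```python
def join_terms(terms):
    """Drop None and blank entries, then join the rest with spaces."""
    cleaned = [t.strip() for t in terms if t is not None and t.strip()]
    if not cleaned:
        # Mirrors build(), which raises when no terms were added.
        raise ValueError("no search terms have been added")
    return " ".join(cleaned)


print(join_terms(["covid-19", None, "  ", "vaccine", "efficacy"]))
# covid-19 vaccine efficacy
```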