Module xml_utils

Common XML parsing utilities shared between PubMed and PMC parsers

This module provides reusable XML parsing functions for both string-based and serde-based XML parsing workflows.
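As a rough illustration of the string-based style, the sketch below pulls the text between an opening and closing tag and returns it as a borrowed slice. The name `text_between_sketch` and its signature are assumptions made for this example; they are not the module's actual API, and the sketch only matches opening tags that carry no attributes.

```rust
/// Hypothetical helper (not the crate's real signature): returns the text
/// between the first `<tag>` and the next `</tag>` as a slice of `xml`.
/// Assumes the opening tag has no attributes.
fn text_between_sketch<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{tag}>");
    let close = format!("</{tag}>");
    // Byte offset just past the opening tag.
    let start = xml.find(&open)? + open.len();
    // Search for the closing tag only in the remainder of the input.
    let end = start + xml[start..].find(&close)?;
    Some(&xml[start..end])
}
```

Returning `Option<&str>` avoids an allocation when the caller only needs to inspect the text; an owned variant would call `.to_string()` on the result.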

Functions

decode_xml_entities
Decode XML character entities in a string
extract_all_attributes
Extract all attributes from an XML tag
extract_all_text_between
Extract content between tags for all occurrences
extract_attribute_value
Extract attribute value from XML tag
extract_element_content
Extract element content with its tag name
extract_section_text
Extract text content from a section, handling nested tags
extract_text_between
Extract text between two XML tags
extract_text_between_ref
Extract text between two XML tags as a borrowed string slice
find_all_tags
Find all occurrences of a tag in content
is_self_closing_tag
Check if a tag is self-closing
strip_inline_html_tags
Strip inline HTML-like formatting tags from XML content
strip_xml_tags
Strip XML tags from content
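
The listing above gives only one-line summaries, so the following is a minimal sketch of what entity decoding, tag stripping, and a self-closing check could look like. All three function names and signatures are hypothetical stand-ins chosen for illustration; the real functions in this module may take different arguments, return different types, and handle more cases (numeric character references, CDATA, comments).

```rust
/// Hypothetical sketch: decode the five predefined XML entities.
fn decode_entities_sketch(input: &str) -> String {
    // `&amp;` is replaced last so that `&amp;lt;` becomes `&lt;`, not `<`.
    input
        .replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&quot;", "\"")
        .replace("&apos;", "'")
        .replace("&amp;", "&")
}

/// Hypothetical sketch: drop everything between `<` and `>`, keep the rest.
fn strip_tags_sketch(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut in_tag = false;
    for ch in input.chars() {
        match ch {
            '<' => in_tag = true,
            '>' if in_tag => in_tag = false,
            _ if !in_tag => out.push(ch),
            _ => {} // character inside a tag: skip it
        }
    }
    out
}

/// Hypothetical sketch: a tag is treated as self-closing if it ends in `/>`.
fn is_self_closing_sketch(tag: &str) -> bool {
    tag.trim_end().ends_with("/>")
}
```

A typical string-based pipeline would strip tags first and decode entities afterwards, so that a decoded `<` is never mistaken for the start of a tag.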