pub fn extract_categories(content: &str) -> Vec<String>
Extracts article categories from the <article-categories>/<subj-group>/<subject> elements of the given content.
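
The sketch below illustrates one way a function with this signature could behave. It is an assumption, not the actual implementation: the name `extract_categories_sketch` is hypothetical, and it uses plain string scanning from the standard library rather than a real XML parser. It collects the trimmed text of each <subject> element found inside the first <article-categories> block.

```rust
/// Hypothetical sketch (not the documented function's real body): collects the
/// text of every <subject> element inside the first <article-categories> block.
pub fn extract_categories_sketch(content: &str) -> Vec<String> {
    let mut categories = Vec::new();

    // Narrow the search to the <article-categories> block, if there is one.
    let Some(start) = content.find("<article-categories") else {
        return categories;
    };
    let block = match content[start..].find("</article-categories>") {
        Some(end) => &content[start..start + end],
        None => &content[start..],
    };

    // Walk through each <subject ...>text</subject> pair in the block.
    let mut rest = block;
    while let Some(open) = rest.find("<subject") {
        let after_open = &rest[open..];
        // Skip past the end of the opening tag, which may carry attributes.
        let Some(tag_end) = after_open.find('>') else { break };
        let body = &after_open[tag_end + 1..];
        let Some(close) = body.find("</subject>") else { break };

        // Keep only non-empty text, with surrounding whitespace removed.
        let text = body[..close].trim();
        if !text.is_empty() {
            categories.push(text.to_string());
        }
        rest = &body[close + "</subject>".len()..];
    }

    categories
}
```

This simplified scan does not decode entities, strip nested inline markup, or handle self-closing <subject/> tags; a production version would more likely rely on a proper XML parser.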