Changes in version 0.5.0 (2026-06-02)

New features

- taxobase: Added a built-in reference dataset containing 50 taxonomic clades. The dataset includes pre-computed distance matrices, lineage paths, search outputs, and coverage vectors. It acts as a complete offline fallback, allowing package examples, vignettes, and unit tests to run instantly without requiring a network connection to The Taxonomicon.

Bug fixes

- Fixed a bug where any taxon whose name contains the substring "planet" (e.g. Periplaneta) was incorrectly discarded by the astronomical homonym filter in get_taxonomicon_id() and taxo_search(). The filter now uses word boundaries (\bplanet\b) so that genus names containing "planet" as an infix are no longer mistakenly treated as non-biological entries.
- Fixed a lineage parsing bug affecting genera whose Taxonomicon page does not link back to the genus itself (e.g. Periplaneta, which only links to its constituent species). The lineage is now correctly truncated before child species entries, and the genus name is appended if absent.
- Added "Epifamily" to the bare rank token filter so that unprefixed rank labels are not retained as spurious lineage nodes.
- Fixed a lineage parsing bug where the superscript type-marker ᵀ (U+1D40) immediately following a taxon name prevented word-boundary matching in the cutoff search, causing the lineage to overshoot the target node and incorrectly include descendant taxa as ancestors. This silently affected any type genus or type species (i.e. the majority of genera), and was particularly visible in deep clades such as Dinosauria where several child clades were being parsed as ancestor nodes. Higher-rank taxa (clades, orders, families; anything above genus level) were disproportionately affected.
- Fixed a lineage parsing bug where navigation header text prepended to the #divPageContent block could contain the target taxon name (e.g. in a cited reference title), causing the lineage cutoff to land in the junk header rather than the actual taxonomic tree. Affected taxa received a garbage two- or three-line lineage and were silently discarded by the "Biota" membership check in get_taxonomicon_id(). Drosophila (the fly genus, ID 28940) was one such case.

Changes in version 0.4.0 (2026-05-25)

New features

- cache_info(): inspects the current session cache, reporting the number of cached lineages and IDs, total memory used, and the names of all taxa whose lineages are stored. Returns a named list invisibly for programmatic use.
- focal_distances(): computes taxonomic distances from a single focal taxon to all members of a community vector. Returns a sorted data frame with columns taxon, distance, mrca, and mrca_depth, making it easy to identify the closest and most distant relatives of a focal species in a community context.

Bug fixes

- Fixed a lineage parsing bug where taxa with Subphylum and Infraphylum rank prefixes (e.g., Craniata, Vertebrata) were being incorrectly discarded. The dagger symbol (†) is now stripped before rank prefixes are removed, ensuring that prefixed names like Subphylum † Craniata are correctly parsed to Craniata rather than falling through as bare rank tokens. Subphylum and Infraphylum have been removed from bare_ranks accordingly.
- Fixed parsing of auct. author annotations (e.g., Subphylum Craniata auct.) that were preventing rank-prefixed names from being correctly cleaned.
- Fixed intermediate clades being silently dropped from lineages when their author strings contained ampersands, hyphens (e.g., Cavalier-Smith), or trailing commas without a year (e.g., Romeriida Gauthier,). The author removal pipeline now handles these patterns correctly.
- Fixed a bug where the dagger extinction marker (†) appearing between a rank prefix and a taxon name (e.g., Family † Dromaeosauridae) caused the taxon to be lost entirely. Dagger removal now precedes all other cleaning steps.
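
The ordering described in the 0.4.0 dagger fixes (strip the dagger first, then remove the rank prefix) can be sketched as follows. This is only an illustration under stated assumptions: clean_label() and rank_prefixes are hypothetical names, the prefix list is deliberately tiny, and none of this is taken from the package's actual cleaning pipeline.

```r
# Hypothetical prefix list, for illustration only.
rank_prefixes <- c("Subphylum", "Infraphylum", "Family")

# Hypothetical helper: clean a single raw lineage label.
clean_label <- function(x) {
  # 1. Strip the dagger (U+2020) first, so "Family <dagger> X" becomes "Family X".
  x <- gsub("\u2020", "", x, fixed = TRUE)
  # Collapse the doubled space left behind and trim the ends.
  x <- trimws(gsub("\\s+", " ", x))
  # 2. Only then drop a leading rank prefix followed by whitespace.
  pattern <- paste0("^(", paste(rank_prefixes, collapse = "|"), ")\\s+")
  sub(pattern, "", x)
}

clean_label("Family \u2020 Dromaeosauridae")
# [1] "Dromaeosauridae"
clean_label("Subphylum \u2020 Craniata")
# [1] "Craniata"
```

If the prefix were removed first, the dagger would be left at the start of the label, which is one way such names could fail later cleaning steps.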

Changes in version 0.3.0 (2026-05-05)

New features

- taxo_search(): Added a new interactive search function that queries The Taxonomicon and returns a tidy data frame of all available IDs, ranks, and authors for a given taxon name.
- Direct ID support: get_lineage() and all distance functions (e.g., taxo_distance(), distance_matrix()) now accept Taxonomicon numeric IDs. This provides a fallback for computing distances when homonyms or historical ranks cannot be disambiguated by name alone.

Network handling

- Added a 30-second timeout to all HTTP requests to prevent the R session from hanging indefinitely when The Taxonomicon servers are overloaded.
- Network failures, offline servers, or bad HTTP statuses now immediately emit a clear, informative warning to the user rather than failing silently.

Bug fixes

- Extended rank-prefix filter to remove additional bare rank tokens that were being retained as spurious lineage nodes: Subgenus, Section, Division, Subdivision, Supercohort, Infracohort, Subsection, Candidatus, Parvphylum, Branch, and Go to (a navigation artefact from The Taxonomicon page layout).
- Duplicate lineage nodes originating from data-quality issues in The Taxonomicon (e.g. Uropygi appearing twice in Thelyphonida, Myxomycetes appearing twice in Physarum) are now collapsed to a single occurrence. A warning is emitted when deduplication occurs so the user is aware of the upstream data issue.
- get_taxonomicon_id() now collects all biological matches for a taxon name before returning, and emits a warning when multiple valid biological entries are found in The Taxonomicon. This surfaces homonym ambiguity (e.g. Nereis matching both a polychaete worm and a butterfly genus) that was previously silent.
- get_taxonomicon_id() now properly follows taxonomic redirects and removes interface noise (N|T|P...) from warnings.
- When multiple biological entries are found (e.g. Bacteria), the warning now lists all available numeric IDs without duplicates, allowing users to make an informed choice.
- Made lineage parsing more robust by strictly truncating any philosophical or pre-basal nodes (e.g. organisms) appearing before Biota.
- Added validation to get_lineage_by_id() to silently return NULL for non-numeric strings, preventing upstream server fallback errors.
- Fixed a lineage parsing issue where the parent Genus was erroneously truncated from species-level queries. Distances and Most Recent Common Ancestors (MRCAs) between congeneric species are now computed accurately.
- Fixed an issue where taxonomic author names enclosed in double quotes (e.g., "Redtenbacher, 1906") were evading the lineage cleaning pipeline and being erroneously retained as part of the clade name.

Changes in version 0.2.0 (2026-04-21)

New functions

- taxo_cluster(): hierarchical clustering of taxa by taxonomic distance.
- taxo_ordinate(): ordination (PCoA) of taxa in taxonomic distance space.
- save_cache(): serialises the session lineage cache to an .rds file.
- load_cache(): restores a previously saved cache, avoiding redundant network requests.
- taxo_path(): returns the full node-by-node path between two taxa as a tidy data frame, ascending from taxon A to their MRCA and descending to taxon B.

Documentation

- Added a new vignette (Statistical Applications of taxodist) demonstrating the integration of taxodist with ape (tree plotting) and vegan (taxonomic distinctness, Mantel tests, and PERMANOVA).

Minor improvements

- Added vegan to Suggests in DESCRIPTION.
- Documented the compatibility of distance_matrix() output with vegan functions (taxondive(), mantel(), adonis2()).
- Documented the conversion of taxo_cluster() results into phylo objects using ape::as.phylo().

Bug fixes

- Fixed incorrect MRCA computation caused by unnamed crown nodes ([crown]) being matched across different lineages as if they were the same ancestor.
- Fixed parsing of Taxonomicon rank prefixes (Cohort, Subcohort, Magnorder, Grandorder, Parvorder, Legion) that were being retained as bare rank names instead of the actual clade names (e.g., Placentalia, Boreoeutheria, Galloanserae).

Changes in version 0.1.0 (2026-03-23)

- Initial release
- Compute taxonomic distances between any two taxa using The Taxonomicon
- Session-level caching for lineage data to minimize network requests
- Functions: taxo_distance(), mrca(), distance_matrix(), closest_relative(), compare_lineages(), shared_clades(), is_member(), filter_clade(), lineage_depth(), check_coverage(), clear_cache()
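
As a rough illustration of the MRCA idea behind mrca() and of the [crown] fix listed under version 0.2.0, the sketch below finds the deepest named node shared by two lineages while refusing to match unnamed crown nodes. It assumes lineages are character vectors ordered from root to tip. lineage_mrca() is a hypothetical helper and the lineages are toy data, so this should not be read as the package's implementation.

```r
# Hypothetical helper, not the package's mrca(). Lineages are assumed to be
# character vectors ordered from root to tip.
lineage_mrca <- function(a, b) {
  # Positions in `a` whose name also occurs in `b`, skipping unnamed crown
  # nodes, which carry no identity of their own.
  idx <- which(a %in% b & a != "[crown]")
  if (length(idx) == 0) return(NA_character_)
  a[max(idx)]  # deepest shared named node along `a`
}

# Toy lineages for illustration only, not real Taxonomicon output.
a <- c("Biota", "Animalia", "[crown]", "Mammalia")
b <- c("Biota", "Animalia", "Arthropoda", "[crown]", "Insecta")

lineage_mrca(a, b)
# [1] "Animalia"
```

Without the `a != "[crown]"` condition, the deepest name-level match in this toy data would be "[crown]", even though the two crown nodes sit under different parents.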