XPath MCP Server
The XPath MCP Server acts as a high-precision search tool for structured documents. Instead of forcing an AI to read through an entire codebase or a massive data file, this tool lets it pinpoint and extract exactly what it needs from an XML or HTML document using standard XPath expressions.
How to Use
1. Installation
Via Smithery (Automatic for Claude Desktop):
npx -y @smithery/cli install @thirdstrandstudio/mcp-xpath --client claude
Manual Installation (Local Build):
# Install dependencies
npm install
# Build the package
npm run build
2. Configuration
To use this server with Claude Desktop, add the configuration to your claude_desktop_config.json file using one of the following methods:
Method A: Using npx (Recommended)
{
"mcpServers": {
"xpath": {
"command": "npx",
"args": [
"@thirdstrandstudio/mcp-xpath"
]
}
}
}
Method B: Direct Node.js (For local development/builds)
Replace /path/to/mcp-xpath with the actual path to your repository.
{
"mcpServers": {
"xpath": {
"command": "node",
"args": [
"/path/to/mcp-xpath/dist/index.js"
]
}
}
}
3. Available Tools
xpath: Query XML content using XPath expressions.
* xml (string): The XML content to query.
* query (string): The XPath query to execute.
* mimeType (string, optional): The MIME type of the content (e.g., text/xml, application/xml, text/html, application/xhtml+xml).
xpathwithurl: Fetch content from a URL and query it using XPath expressions.
* url (string): The URL to fetch XML/HTML content from.
* query (string): The XPath query to execute.
* mimeType (string, optional): The MIME type of the fetched content.
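For reference, a call to the xpath tool travels over MCP as a standard tools/call request. A minimal sketch of the JSON-RPC payload (the argument values are illustrative, and the exact wire framing is handled by your MCP client):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "xpath",
    "arguments": {
      "xml": "<root><item>value1</item></root>",
      "query": "//item/text()",
      "mimeType": "text/xml"
    }
  }
}
```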
4. Example Prompts
Querying local XML/HTML content:
* "Extract the text of each <item> element from this XML: <root><item>value1</item><item>value2</item></root> using the query //item/text()."
* "Get the href of every link in this HTML snippet: <html><body><a href='link1.html'>Link 1</a></body></html> using the query //a/@href."
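These two queries can be sanity-checked outside the server with Python's standard library. Note this is only a rough sketch of the queries' semantics, not how the server evaluates them: xml.etree.ElementTree implements a subset of XPath 1.0, so the text() and @href node tests are expressed as .text and .get('href') in Python.

```python
import xml.etree.ElementTree as ET

# Equivalent of //item/text() on the first example:
root = ET.fromstring("<root><item>value1</item><item>value2</item></root>")
values = [el.text for el in root.findall(".//item")]
print(values)  # ['value1', 'value2']

# Equivalent of //a/@href on the second example:
html = ET.fromstring("<html><body><a href='link1.html'>Link 1</a></body></html>")
hrefs = [a.get("href") for a in html.findall(".//a")]
print(hrefs)  # ['link1.html']
```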
Querying content from a URL:
* "Get all the links from https://example.com using the XPath query //a/@href with the text/html mimeType."
Use Cases
Use Case 1: Targeted Web Scraping for Product Monitoring
Problem: Users often want to track specific data from a website, such as a product's price, availability, or version number, without downloading or reading through the entire HTML source code of a page.
Solution: The xpathwithurl tool allows an AI to fetch a webpage and extract only the precise data point needed by targeting its specific XPath. This reduces token usage and provides immediate, structured answers.
Example: An AI agent is tasked with checking a competitor's website for a specific price. It uses xpathwithurl with the URL and a query like //span[contains(@class, 'price-amount')]/text() to return only the numerical price value.
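The semantics of that contains() query can be sketched with stdlib Python (the markup, class name, and price below are all hypothetical; ElementTree has no contains() function, so the predicate becomes a substring test):

```python
import xml.etree.ElementTree as ET

# Hypothetical product-page fragment, invented for illustration.
page = ET.fromstring(
    "<div><span class='price-amount usd'>19.99</span>"
    "<span class='label'>USD</span></div>"
)
# Rough equivalent of //span[contains(@class, 'price-amount')]/text():
prices = [s.text for s in page.iter("span")
          if "price-amount" in (s.get("class") or "")]
print(prices)  # ['19.99']
```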
Use Case 2: Analyzing Complex XML Configuration Files
Problem: Developers frequently work with massive XML configuration files, such as Maven pom.xml, Android AndroidManifest.xml, or enterprise service configurations. Manually searching for a specific dependency version or a nested setting in a 1,000-line file is error-prone and slow.
Solution: By using the xpath tool, a developer can provide the XML content to the AI and ask it to find specific values using structured queries. The AI can programmatically verify settings or extract lists of dependencies.
Example: "Find the version of the 'spring-core' dependency in this XML." The AI executes //dependency[artifactId='spring-core']/version/text() to retrieve the exact string instantly.
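A stdlib Python sketch of that query on a namespace-free fragment (the version number and second dependency are invented). One caveat worth knowing: real pom.xml files declare a default namespace, so in practice the query must either be namespace-aware or the namespace must be stripped first.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free pom.xml fragment for illustration.
pom = ET.fromstring(
    "<project><dependencies>"
    "<dependency><artifactId>spring-core</artifactId>"
    "<version>6.1.3</version></dependency>"
    "<dependency><artifactId>junit</artifactId>"
    "<version>4.13.2</version></dependency>"
    "</dependencies></project>"
)
# Equivalent of //dependency[artifactId='spring-core']/version/text():
version = pom.find(".//dependency[artifactId='spring-core']/version").text
print(version)  # 6.1.3
```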
Use Case 3: Automated SEO and Meta-Data Auditing
Problem: SEO specialists and web developers need to verify that pages have correct metadata, such as Open Graph tags, canonical URLs, or specific heading structures (H1-H6), to ensure search engine compliance.
Solution: Using xpathwithurl, the AI can act as an automated auditor. It can crawl a provided URL and specifically pull out SEO-relevant tags to report on their presence or content.
Example: A user asks, "Does this blog post have a proper meta description and an H1 title?" The AI runs two queries: //meta[@name='description']/@content and //h1/text() to provide a concise report.
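Both checks can be mimicked on a well-formed snippet with stdlib Python (the page fragment is invented; real-world HTML is rarely well-formed XML and would need an HTML parser, which is why the server accepts a text/html mimeType):

```python
import xml.etree.ElementTree as ET

# Invented, well-formed page fragment for illustration.
page = ET.fromstring(
    "<html><head><meta name='description' content='A short summary.'/>"
    "</head><body><h1>Post Title</h1></body></html>"
)
# Equivalent of //meta[@name='description']/@content:
desc = page.find(".//meta[@name='description']").get("content")
# Equivalent of //h1/text():
h1 = page.find(".//h1").text
print(desc, "|", h1)  # A short summary. | Post Title
```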
Use Case 4: Extracting Content from RSS and Atom Feeds
Problem: RSS feeds are highly structured XML, but reading them in raw format is difficult for humans. Users often want to summarize the latest headlines or extract specific links from a news feed without using a dedicated RSS reader.
Solution: This server can fetch a live RSS feed and query specific elements such as titles, publication dates, or enclosure URLs.
Example: To get the titles of the last five articles from a news site, the AI uses xpathwithurl on the RSS link with the query //item[position() <= 5]/title/text().
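A rough stdlib sketch of that query (the feed contents are invented; ElementTree lacks the position() function, so the result list is sliced instead):

```python
import xml.etree.ElementTree as ET

# Minimal invented RSS feed with seven items.
rss = ET.fromstring(
    "<rss><channel>"
    + "".join(f"<item><title>Headline {n}</title></item>" for n in range(1, 8))
    + "</channel></rss>"
)
# Equivalent of //item[position() <= 5]/title/text():
titles = [item.findtext("title") for item in rss.findall(".//item")][:5]
print(titles)  # ['Headline 1', ..., 'Headline 5']
```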
Use Case 5: Modernizing Legacy HTML to Clean Data
Problem: When migrating data from an old website or a legacy system that outputs messy HTML, it is difficult to extract clean text for use in a new database or Markdown documentation.
Solution: The xpath tool can be used to "scrape" specific sections of a local HTML snippet, ignoring sidebars, scripts, and ads, to get just the core content.
Example: A user pastes a messy HTML table into the chat. The AI uses //table[@id='data-table']//tr/td[1]/text() to extract only the first column of data, effectively cleaning the data for a CSV export.
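What that query does can be sketched in stdlib Python (the table id and cell values are invented; since the root element here is the table itself, the //table[@id='data-table'] step is implicit):

```python
import xml.etree.ElementTree as ET

# Invented table fragment for illustration.
table = ET.fromstring(
    "<table id='data-table'>"
    "<tr><td>alpha</td><td>1</td></tr>"
    "<tr><td>beta</td><td>2</td></tr>"
    "</table>"
)
# Equivalent of //tr/td[1]/text() — first cell of each row,
# ready to be written out as a one-column CSV:
first_col = [td.text for td in table.findall(".//tr/td[1]")]
print(first_col)  # ['alpha', 'beta']
```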