IronWiki
A production-ready MediaWiki wikitext parser and renderer for .NET. Parse wikitext into a full AST, render to HTML/Markdown/PlainText, expand templates with 40+ parser functions, and extract document metadata.
Features
- Complete Wikitext Parsing - Full AST with source span tracking for all MediaWiki syntax
- Multiple Renderers - HTML, Markdown (GitHub-flavored), and PlainText output
- Template Expansion - Recursive expansion with 40+ parser functions (#if, #switch, #expr, #time, etc.)
- Document Analysis - Extract categories, sections, TOC, references, links, images, and templates
- Security-First - HTML sanitization, XSS prevention, safe tag whitelisting
- Extensible - Interfaces for custom template resolvers and image handlers
- Modern .NET - Targets .NET 9.0 with nullable reference types and async support
Installation
dotnet add package MarketAlly.IronWiki
Or via the NuGet Package Manager:
Install-Package MarketAlly.IronWiki
Quick Start
Basic Parsing and Rendering
using MarketAlly.IronWiki.Parsing;
using MarketAlly.IronWiki.Rendering;
// Parse wikitext
var parser = new WikitextParser();
var document = parser.Parse("'''Hello''' ''World''! See [[Main Page]].");
// Render to HTML
var htmlRenderer = new HtmlRenderer();
string html = htmlRenderer.Render(document);
// Output: <p><b>Hello</b> <i>World</i>! See <a href="/wiki/Main_Page">Main Page</a>.</p>
// Render to Markdown
var markdownRenderer = new MarkdownRenderer();
string markdown = markdownRenderer.Render(document);
// Output: **Hello** *World*! See [Main Page](/wiki/Main_Page).
// Render to plain text
string plainText = document.ToPlainText();
// Output: Hello World! See Main Page.
Template Expansion
using MarketAlly.IronWiki.Parsing;
using MarketAlly.IronWiki.Rendering;
var parser = new WikitextParser();
// Create a template content provider
var provider = new DictionaryTemplateContentProvider();
provider.Add("Greeting", "Hello, {{{1|World}}}!");
provider.Add("Infobox", @"
{| class=""infobox""
|-
! {{{title}}}
|-
| Type: {{{type|Unknown}}}
|}");
// Expand templates
var expander = new TemplateExpander(parser, provider);
var document = parser.Parse("{{Greeting|Alice}} {{Infobox|title=Example|type=Demo}}");
string result = expander.Expand(document);
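The `{{{1|World}}}` and `{{{type|Unknown}}}` placeholders above follow MediaWiki's parameter syntax: a supplied argument replaces the placeholder, otherwise the text after the pipe is used, and a placeholder with neither is left as literal text. The following is a minimal standalone sketch of that rule, not IronWiki's implementation; `ParameterSketch` and `Substitute` are hypothetical names, and nested braces are deliberately not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Positional arguments are keyed "1", "2", ...; named arguments by their name.
var supplied = new Dictionary<string, string> { ["1"] = "Alice" };
Console.WriteLine(ParameterSketch.Substitute("Hello, {{{1|World}}}!", supplied));                          // Hello, Alice!
Console.WriteLine(ParameterSketch.Substitute("Hello, {{{1|World}}}!", new Dictionary<string, string>())); // Hello, World!

// Hypothetical helper for illustration only.
static class ParameterSketch
{
    // Matches {{{name}}} or {{{name|default}}}; nested braces are not supported.
    private static readonly Regex Parameter =
        new(@"\{\{\{([^{}|]+)(?:\|([^{}]*))?\}\}\}", RegexOptions.Compiled);

    public static string Substitute(string body, IReadOnlyDictionary<string, string> arguments)
    {
        return Parameter.Replace(body, match =>
        {
            var name = match.Groups[1].Value.Trim();
            if (arguments.TryGetValue(name, out var value))
                return value;
            // No argument: use the written default, otherwise keep the literal placeholder.
            return match.Groups[2].Success ? match.Groups[2].Value : match.Value;
        });
    }
}
```

A real expander also has to deal with nested templates and parameters inside defaults, which is why this step is normally done on the AST rather than with a regular expression.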
Parser Functions
IronWiki supports 40+ MediaWiki parser functions:
var parser = new WikitextParser();
var provider = new DictionaryTemplateContentProvider();
var expander = new TemplateExpander(parser, provider);
// Conditionals
var doc1 = parser.Parse("{{#if: yes | True | False}}");
expander.Expand(doc1); // "True"
// String case
var doc2 = parser.Parse("{{uc:hello world}}");
expander.Expand(doc2); // "HELLO WORLD"
// Math expressions
var doc3 = parser.Parse("{{#expr: 2 + 3 * 4}}");
expander.Expand(doc3); // "14"
// Switch statements
var doc4 = parser.Parse("{{#switch: b | a=First | b=Second | c=Third}}");
expander.Expand(doc4); // "Second"
// String manipulation
var doc5 = parser.Parse("{{#len:Hello}}");
expander.Expand(doc5); // "5"
Supported Parser Functions:
- Conditionals: #if, #ifeq, #ifexpr, #ifexist, #iferror, #switch
- String Case: lc, uc, lcfirst, ucfirst
- String Functions: #len, #pos, #rpos, #sub, #replace, #explode, #pad, padleft, padright
- URL Functions: #urlencode, #urldecode, #anchorencode, fullurl, localurl
- Title Functions: #titleparts, ns
- Date/Time: #time, #timel, currentyear, currentmonth, currentday, currenttimestamp, etc.
- Math: #expr (full expression evaluator with +, -, *, /, ^, mod, parentheses)
- Formatting: formatnum, plural
- Misc: #tag, ! (pipe escape)
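To show why `{{#expr: 2 + 3 * 4}}` yields 14 rather than 20, here is a small recursive descent evaluator in which multiplication binds tighter than addition. It is a standalone sketch, not IronWiki's evaluator: `ExprSketch` is a hypothetical name, and only `+`, `-`, `*`, `/`, unary minus, and parentheses are covered (`^` and `mod` are left out).

```csharp
using System;
using System.Globalization;

Console.WriteLine(ExprSketch.Evaluate("2 + 3 * 4"));   // 14
Console.WriteLine(ExprSketch.Evaluate("(2 + 3) * 4")); // 20

// Hypothetical helper for illustration only.
sealed class ExprSketch
{
    private readonly string _text;
    private int _pos;

    private ExprSketch(string text) => _text = text;

    public static double Evaluate(string text)
    {
        var parser = new ExprSketch(text);
        var value = parser.ParseSum();
        parser.SkipSpaces();
        if (parser._pos != text.Length)
            throw new FormatException($"Unexpected character at {parser._pos}.");
        return value;
    }

    // sum := product (('+' | '-') product)*
    private double ParseSum()
    {
        var left = ParseProduct();
        while (true)
        {
            if (TryConsume('+')) left += ParseProduct();
            else if (TryConsume('-')) left -= ParseProduct();
            else return left;
        }
    }

    // product := atom (('*' | '/') atom)*
    private double ParseProduct()
    {
        var left = ParseAtom();
        while (true)
        {
            if (TryConsume('*')) left *= ParseAtom();
            else if (TryConsume('/')) left /= ParseAtom();
            else return left;
        }
    }

    // atom := number | '(' sum ')' | '-' atom
    private double ParseAtom()
    {
        if (TryConsume('-')) return -ParseAtom();
        if (TryConsume('('))
        {
            var inner = ParseSum();
            if (!TryConsume(')')) throw new FormatException("Missing ')'.");
            return inner;
        }
        SkipSpaces();
        var start = _pos;
        while (_pos < _text.Length && (char.IsDigit(_text[_pos]) || _text[_pos] == '.')) _pos++;
        if (start == _pos) throw new FormatException($"Number expected at {start}.");
        return double.Parse(_text.Substring(start, _pos - start), NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    private void SkipSpaces()
    {
        while (_pos < _text.Length && char.IsWhiteSpace(_text[_pos])) _pos++;
    }

    private bool TryConsume(char expected)
    {
        SkipSpaces();
        if (_pos < _text.Length && _text[_pos] == expected) { _pos++; return true; }
        return false;
    }
}
```

Each grammar rule gets its own method, so operator precedence falls out of the call structure: `ParseSum` only ever sees operands that `ParseProduct` has already reduced.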
Document Analysis
Extract metadata from parsed documents:
using MarketAlly.IronWiki.Parsing;
using MarketAlly.IronWiki.Analysis;
var parser = new WikitextParser();
var analyzer = new DocumentAnalyzer();
var document = parser.Parse(@"
#REDIRECT [[Target Page]]
");
// Or analyze a full article
var article = parser.Parse(@"
== Introduction ==
This article is about [[Topic]].
{{Infobox|title=Example}}
[[File:Example.jpg|thumb|A caption]]
== Details ==
More content here.<ref name=""source1"">Citation text</ref>
=== Subsection ===
Additional details.<ref>Another citation</ref>
== References ==
<references/>
[[Category:Examples]]
[[Category:Documentation|IronWiki]]
");
var metadata = analyzer.Analyze(article);
// Check for redirect (using the redirect page parsed above)
var redirectInfo = analyzer.Analyze(document);
if (redirectInfo.IsRedirect)
{
    Console.WriteLine($"Redirects to: {redirectInfo.Redirect.Target}");
}
// Categories
foreach (var category in metadata.Categories)
{
    Console.WriteLine($"Category: {category.Name}, Sort Key: {category.SortKey}");
}
// Sections and Table of Contents
foreach (var section in metadata.Sections)
{
    Console.WriteLine($"Section: {section.Title} (Level {section.Level}, Anchor: {section.Anchor})");
}
// References
foreach (var reference in metadata.References)
{
    Console.WriteLine($"Ref #{reference.Number}: {reference.Content}");
}
// Links
Console.WriteLine($"Internal links: {metadata.InternalLinks.Count}");
Console.WriteLine($"External links: {metadata.ExternalLinks.Count}");
Console.WriteLine($"Images: {metadata.Images.Count}");
Console.WriteLine($"Templates: {metadata.Templates.Count}");
// Unique values
var linkedArticles = metadata.LinkedArticles; // Unique article titles
var templateNames = metadata.TemplateNames; // Unique template names
var imageFiles = metadata.ImageFileNames; // Unique image filenames
Custom Template Resolution
Integrate with your own template sources:
using MarketAlly.IronWiki.Rendering;
using MarketAlly.IronWiki.Nodes;
// Implement ITemplateContentProvider for raw wikitext
public class DatabaseTemplateProvider : ITemplateContentProvider
{
    private readonly IDatabase _db;
    // IDatabase stands in for your own data-access abstraction
    public DatabaseTemplateProvider(IDatabase db) => _db = db;
    public string? GetContent(string templateName)
    {
        return _db.GetTemplateWikitext(templateName);
    }
    public async Task<string?> GetContentAsync(string templateName, CancellationToken ct)
    {
        return await _db.GetTemplateWikitextAsync(templateName, ct);
    }
}
// Or implement ITemplateResolver for pre-rendered content
public class ApiTemplateResolver : ITemplateResolver
{
    public string? Resolve(Template template, RenderContext context)
    {
        // Call external API to expand template
        return CallMediaWikiApi(template);
    }
}
// Chain multiple providers with fallback
var provider = new ChainedTemplateContentProvider(
    new MemoryCacheProvider(cache),
    new DatabaseTemplateProvider(db),
    new WikiApiProvider(httpClient)
);
var expander = new TemplateExpander(parser, provider);
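The chaining above reads as a fallback: try the cache, then the database, then the remote API. This README does not spell out the exact resolution rule, so the sketch below assumes the common one, where the first source that returns non-null content wins. `ITemplateSource`, `DictionarySource`, and `FallbackSource` are hypothetical stand-ins, not IronWiki types, and only the synchronous lookup is modelled.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

var cache = new DictionarySource(new Dictionary<string, string> { ["Greeting"] = "Hello, {{{1|World}}}!" });
var database = new DictionarySource(new Dictionary<string, string> { ["Infobox"] = "{| class=\"infobox\" |}" });
var chain = new FallbackSource(cache, database);

Console.WriteLine(chain.GetContent("Infobox") ?? "(missing)"); // served by the second source
Console.WriteLine(chain.GetContent("Unknown") ?? "(missing)"); // (missing)

// Hypothetical stand-in for a template content provider.
interface ITemplateSource
{
    string? GetContent(string templateName);
}

sealed class DictionarySource : ITemplateSource
{
    private readonly IReadOnlyDictionary<string, string> _templates;

    public DictionarySource(IReadOnlyDictionary<string, string> templates) => _templates = templates;

    public string? GetContent(string templateName) =>
        _templates.TryGetValue(templateName, out var text) ? text : null;
}

sealed class FallbackSource : ITemplateSource
{
    private readonly ITemplateSource[] _sources;

    public FallbackSource(params ITemplateSource[] sources) => _sources = sources;

    public string? GetContent(string templateName)
    {
        // Ask each source in order and stop at the first hit.
        foreach (var source in _sources)
        {
            var content = source.GetContent(templateName);
            if (content is not null) return content;
        }
        return null;
    }
}
```

Ordering matters under this assumption: put the cheapest source first so that expensive lookups only run on a miss.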
Custom Image Resolution
Handle image URLs for your environment:
using MarketAlly.IronWiki.Rendering;
// Simple pattern-based resolver
var imageResolver = new UrlPatternImageResolver(
    "https://upload.wikimedia.org/wikipedia/commons/{0}"
);
var renderer = new HtmlRenderer(imageResolver: imageResolver);
// Or implement custom logic
public class CustomImageResolver : IImageResolver
{
    public string? ResolveUrl(string fileName, int? width, int? height)
    {
        // ComputeMd5Hash is your own helper returning a lowercase hex digest
        var hash = ComputeMd5Hash(fileName);
        return $"https://cdn.example.com/{hash[0]}/{hash[0..2]}/{fileName}";
    }
}
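The custom resolver above relies on a `ComputeMd5Hash` helper that is not shown. One possible version follows, as a sketch under two assumptions: the hash is the lowercase hex MD5 of the UTF-8 file name, and spaces are normalised to underscores first, as in MediaWiki's hashed upload layout. `ImagePathSketch` and `BuildUrl` are hypothetical names.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

Console.WriteLine(ImagePathSketch.BuildUrl("https://cdn.example.com", "Example.jpg"));

// Hypothetical helper for illustration only.
static class ImagePathSketch
{
    // Lowercase hex MD5 of the file name, with spaces normalised to underscores (assumption).
    public static string ComputeMd5Hash(string fileName)
    {
        var normalized = fileName.Replace(' ', '_');
        var digest = MD5.HashData(Encoding.UTF8.GetBytes(normalized));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }

    // Builds base/h/hh/name, where h and hh are the first one and two hex digits of the hash.
    public static string BuildUrl(string baseUrl, string fileName)
    {
        var hash = ComputeMd5Hash(fileName);
        var encoded = Uri.EscapeDataString(fileName.Replace(' ', '_'));
        return $"{baseUrl}/{hash[0]}/{hash[0..2]}/{encoded}";
    }
}
```

MD5 is used here only to spread files across directories, not for security. The file name is also URL-escaped, which the inline example above omits.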
HTML Rendering Options
var options = new HtmlRenderOptions
{
    // Link generation
    ArticleUrlTemplate = "/wiki/{0}",
    // Template handling when no resolver provided
    TemplateOutputMode = TemplateOutputMode.Placeholder, // or Comment, Skip
    // Image handling when no resolver provided
    ImageOutputMode = ImageOutputMode.AltText, // or Placeholder, Skip
    // Table of contents
    GenerateTableOfContents = true,
    TocMinHeadings = 4,
    // Security (defaults are secure)
    AllowRawHtml = false,
    AllowedHtmlTags = ["span", "div", "abbr", "cite", "code", "data", "mark", "q", "s", "small", "sub", "sup", "time", "u", "var"],
    DisallowedAttributes = ["style", "class", "id"]
};
var renderer = new HtmlRenderer(options);
Async Support
All major operations support async/await:
// Async template expansion
var result = await expander.ExpandAsync(document, cancellationToken);
// Async template resolution
var resolver = new AsyncTemplateResolver();
var html = await resolver.ResolveAsync(template, context, cancellationToken);
// Async content provider
var content = await provider.GetContentAsync("Template:Example", cancellationToken);
JSON Serialization
Serialize and deserialize the AST:
using MarketAlly.IronWiki.Serialization;
// Serialize to JSON
var json = WikiJsonSerializer.Serialize(document, writeIndented: true);
// Or use extension method
var json2 = document.ToJson();
// Deserialize back
var restored = WikiJsonSerializer.DeserializeDocument(json);
Error Handling
The parser provides diagnostics instead of throwing exceptions for malformed input:
var diagnostics = new List<ParsingDiagnostic>();
var document = parser.Parse(wikitext, diagnostics);
foreach (var diagnostic in diagnostics)
{
    Console.WriteLine($"{diagnostic.Severity}: {diagnostic.Message} at position {diagnostic.Span}");
}
Supported Wikitext Syntax
| Feature | Status | Notes |
|---|---|---|
| Formatting | Full | Bold, italic, combined |
| Headings | Full | Levels 1-6 |
| Links | Full | Internal, external, interwiki, categories |
| Images | Full | All parameters (size, alignment, frame, caption) |
| Lists | Full | Ordered, unordered, definition lists |
| Tables | Full | Full syntax with attributes |
| Templates | Full | With parameter substitution |
| Parser Functions | 40+ | See list above |
| Parser Tags | Full | ref, references, nowiki, code, pre, math, gallery, etc. |
| HTML Tags | Sanitized | Safe subset with attribute filtering |
| Comments | Full | HTML comments |
| Magic Words | Partial | Date/time, namespaces |
| Redirects | Full | Detection and extraction |
Architecture
MarketAlly.IronWiki/
├── Parsing/
│ ├── WikitextParser.cs # Main parser entry point
│ ├── ParserCore.cs # Core parsing engine
│ └── ParsingDiagnostic.cs # Error reporting
├── Nodes/
│ ├── WikiNode.cs # Base AST node
│ ├── BlockNodes.cs # Paragraphs, headings, lists
│ ├── InlineNodes.cs # Text, links, formatting
│ └── TableNodes.cs # Table structure
├── Rendering/
│ ├── HtmlRenderer.cs # HTML output
│ ├── MarkdownRenderer.cs # Markdown output
│ ├── PlainTextRenderer.cs # Text extraction
│ ├── TemplateExpander.cs # Template processing
│ ├── ITemplateResolver.cs # Template resolution interface
│ └── IImageResolver.cs # Image URL interface
├── Analysis/
│ ├── DocumentAnalyzer.cs # Metadata extraction
│ └── DocumentMetadata.cs # Metadata models
└── Serialization/
└── WikiJsonSerializer.cs # JSON AST serialization
Performance
- Single-pass parsing - Efficient recursive descent parser
- Object pooling - Reuses parser instances
- Async support - Non-blocking I/O for template resolution
- Lazy evaluation - Deferred processing where possible
- StringBuilder - Efficient string building throughout
Security
IronWiki is designed with security in mind:
- HTML Sanitization - Only whitelisted tags allowed
- Attribute Filtering - Blocks on* event handlers, javascript: URLs
- XSS Prevention - Proper escaping of all user content
- Safe Defaults - Secure configuration out of the box
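As an illustration of the attribute filtering listed above, the following standalone sketch rejects `on*` event handler attributes and values that begin with a `javascript:` scheme. It is not IronWiki's sanitizer; `AttributeFilterSketch` and `IsAllowed` are hypothetical names, and a production filter would also apply a tag and attribute whitelist.

```csharp
using System;
using System.Linq;

Console.WriteLine(AttributeFilterSketch.IsAllowed("title", "Hello"));               // True
Console.WriteLine(AttributeFilterSketch.IsAllowed("onclick", "alert(1)"));          // False
Console.WriteLine(AttributeFilterSketch.IsAllowed("href", " JavaScript:alert(1)")); // False

// Hypothetical helper for illustration only.
static class AttributeFilterSketch
{
    public static bool IsAllowed(string name, string value)
    {
        // Event handlers such as onclick, onload, onerror. Any name starting with "on" is rejected.
        if (name.StartsWith("on", StringComparison.OrdinalIgnoreCase))
            return false;

        // Strip whitespace and control characters before comparing, because browsers
        // tolerate them inside a scheme name.
        var compact = string.Concat(value.Where(c => !char.IsWhiteSpace(c) && !char.IsControl(c)));
        if (compact.StartsWith("javascript:", StringComparison.OrdinalIgnoreCase))
            return false;

        return true;
    }
}
```

The check is deliberately conservative: rejecting every attribute whose name starts with "on" may block a harmless attribute, which is usually the safer trade-off for user-supplied markup.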
Acknowledgments
This project draws significant inspiration from MwParserFromScratch by CXuesong. The original project provided an excellent foundation for understanding MediaWiki wikitext parsing in .NET. IronWiki builds upon these concepts with:
- Modern .NET 9.0 target
- Enhanced template expansion with 40+ parser functions
- Multiple renderer implementations (HTML, Markdown, PlainText)
- Comprehensive document analysis and metadata extraction
- Production-ready security features
We are grateful to CXuesong for their pioneering work in this space.
License
This project is licensed under the MIT License - see the LICENSE file for details.
MIT License
Copyright (c) 2024-2025 MarketAlly LLC
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Author
David H Friedel Jr. - MarketAlly LLC
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Please make sure to update tests as appropriate.