
How to convert HTML to Markdown with C# and ReverseMarkdown


ReverseMarkdown.Net is a .NET library that converts HTML to Markdown from an application written in C#.

ReverseMarkdown relies on a parser that interprets the structure of the HTML and builds an object tree representing its content.

Internally, the library uses HtmlAgilityPack (HAP) to build the DOM of the HTML document, and then converts that tree into the equivalent Markdown.
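To make the idea of a DOM tree more concrete, here is a minimal sketch that uses HtmlAgilityPack directly to parse a fragment and print its node tree. It is not ReverseMarkdown's actual implementation; the class name, the helper method, and the sample HTML are invented for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using HtmlAgilityPack;

class DomWalkSketch
{
    static void Main()
    {
        // Sample fragment invented for this sketch
        var doc = new HtmlDocument();
        doc.LoadHtml("<p>Hello <strong>world</strong></p>");

        // Start at the document root and print every node below it
        PrintNode(doc.DocumentNode, 0);
    }

    // Hypothetical helper: prints one node, then recurses into its children
    static void PrintNode(HtmlNode node, int depth)
    {
        string indent = new string(' ', depth * 2);
        Console.WriteLine($"{indent}{node.NodeType}: {node.Name}");

        foreach (HtmlNode child in node.ChildNodes)
        {
            PrintNode(child, depth + 1);
        }
    }
}
```

A converter can walk this same kind of tree and emit Markdown for each element it recognizes, which is the general approach described above.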

Furthermore, ReverseMarkdown offers several options to customize the conversion, such as how unknown tags, comments, and links are handled.

How to Use ReverseMarkdown.Net

We can easily add the library to a .NET project through the corresponding NuGet package.

Install-Package ReverseMarkdown

Here are some examples of how to use ReverseMarkdown.Net, extracted from the library’s documentation.

var converter = new ReverseMarkdown.Converter();

string html = "This a sample <strong>paragraph</strong> from <a href=\"http://test.com\">my site</a>";

string markdown = converter.Convert(html);

To convert an existing file, we simply read its content first and pass it to the converter.

string html = File.ReadAllText("document.html");
var converter = new ReverseMarkdown.Converter();
string markdown = converter.Convert(html);
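As a possible next step, the converted text can be written back to disk. The sketch below shows one way to do the full round trip; the file names are placeholders and the class name is invented for this example.

```csharp
using System.IO;

class FileConversionSketch
{
    static void Main()
    {
        // Placeholder paths, assumed for this example
        string inputPath = "document.html";
        string outputPath = "document.md";

        // Read the HTML, convert it, and save the Markdown next to it
        string html = File.ReadAllText(inputPath);
        var converter = new ReverseMarkdown.Converter();
        string markdown = converter.Convert(html);

        File.WriteAllText(outputPath, markdown);
    }
}
```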

The conversion process can be customized to our needs by passing a Config object to the converter.

var config = new ReverseMarkdown.Config
{
    // Include unknown tags unchanged in the result (this is also the default)
    UnknownTags = ReverseMarkdown.Config.UnknownTagsOption.PassThrough,
    // generate GitHub flavoured markdown, supported for BR, PRE and table tags
    GithubFlavored = true,
    // will ignore all comments
    RemoveComments = true,
    // remove markdown output for links where appropriate
    SmartHrefHandling = true
};

var converter = new ReverseMarkdown.Converter(config);
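Once configured, the converter is used exactly as before. The following sketch puts a configuration and a conversion together in a small console program; the sample HTML and the class name are made up for illustration, and the exact Markdown produced may vary between library versions.

```csharp
using System;

class ConfiguredConversionSketch
{
    static void Main()
    {
        var config = new ReverseMarkdown.Config
        {
            // Leave HTML comments out of the Markdown output
            RemoveComments = true,
            // Use GitHub-flavored Markdown where the library supports it
            GithubFlavored = true
        };

        var converter = new ReverseMarkdown.Converter(config);

        // Sample input invented for this sketch
        string html = "<!-- internal note --><h1>Title</h1><p>Line one<br>Line two</p>";

        Console.WriteLine(converter.Convert(html));
    }
}
```

Running the same input through a converter created without a config is a quick way to see what each option changes.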