How to convert HTML to Markdown with C# and ReverseMarkdown

ReverseMarkdown.Net is a library for .NET that allows you to convert HTML files to Markdown from a C# application.

ReverseMarkdown parses the structure of the HTML document and builds a tree of objects that represents its content.

Internally, the library uses HtmlAgilityPack (HAP) to build the DOM of the HTML document, which it then converts into the equivalent Markdown.

In addition, ReverseMarkdown has a wide range of options to customize the conversion process.

To work with Markdown files and convert them to HTML, you can use the Markdig library.

How to use ReverseMarkdown.Net

We can easily add the library to a .NET project through the corresponding NuGet package.

Install-Package ReverseMarkdown

Here are some examples of how to use ReverseMarkdown.Net, taken from the library’s documentation:

var converter = new ReverseMarkdown.Converter();

string html = "This a sample <strong>paragraph</strong> from <a href=\"http://test.com\">my site</a>";

string markdown = converter.Convert(html);

If we want to convert an existing file, we simply read its content first and pass it to the converter.

string html = File.ReadAllText("document.html");
var converter = new ReverseMarkdown.Converter();
string markdown = converter.Convert(html);
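
The snippet above leaves the result in memory. As a minimal sketch, assuming we also want to save the Markdown next to the source file, the whole round trip could be wrapped in a small helper. The class name `HtmlFileConverter` and the method `ConvertFile` are hypothetical names chosen for this example and are not part of ReverseMarkdown.

```csharp
using System.IO;

// Hypothetical helper, not part of the ReverseMarkdown library.
public static class HtmlFileConverter
{
    // Reads an HTML file, converts it, and writes the result to a file
    // with the same name and a .md extension. Returns the output path.
    public static string ConvertFile(string inputPath)
    {
        string html = File.ReadAllText(inputPath);

        var converter = new ReverseMarkdown.Converter();
        string markdown = converter.Convert(html);

        // "document.html" becomes "document.md"
        string outputPath = Path.ChangeExtension(inputPath, ".md");
        File.WriteAllText(outputPath, markdown);

        return outputPath;
    }
}
```

Calling `HtmlFileConverter.ConvertFile("document.html")` would then write `document.md` to the same folder, overwriting any existing file with that name.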

The conversion process can be customized to our needs by passing a Config object to the converter.

var config = new ReverseMarkdown.Config
{
    // Include the unknown tag completely in the result (default as well)
    UnknownTags = ReverseMarkdown.Config.UnknownTagsOption.PassThrough,
    // generate GitHub flavoured markdown, supported for BR, PRE and table tags
    GithubFlavored = true,
    // will ignore all comments
    RemoveComments = true,
    // remove markdown output for links where appropriate
    SmartHrefHandling = true
};

var converter = new ReverseMarkdown.Converter(config);
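
Once the converter has been created with a configuration, it is used exactly as before. The following is a small sketch of that, assuming a console application with top-level statements. The HTML string is an invented sample, and only two of the options shown above are set to keep it short.

```csharp
using System;

var config = new ReverseMarkdown.Config
{
    // drop HTML comments from the output
    RemoveComments = true,
    // generate GitHub flavoured markdown
    GithubFlavored = true
};

var converter = new ReverseMarkdown.Converter(config);

// Invented sample input containing a comment and a line break
string html = "<!-- internal note --><p>First line<br>Second line</p>";

string markdown = converter.Convert(html);
Console.WriteLine(markdown);
```

With `RemoveComments` enabled, the text of the comment should not appear in the printed Markdown.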

As we can see, converting an HTML file is very simple. ReverseMarkdown.Net is Open Source, and all the code and documentation are available in the project’s repository at https://github.com/mysticmind/reversemarkdown-net