Rust 中 Syntect 如何自定义生成的 HTML 代码

2024.09.25

本文仅讨论自定义最终生成的 HTML 文本,如代码行是否用标签包含,code 标签外添加的 pre 标签怎么自定义样式,不包含自定义主题。

syntect 代码库提供的案例中,将代码文件转成行内样式的 HTML 文本的实现如下:

rust
//! Prints highlighted HTML for a file to stdout.
//! Basically just wraps a body around `highlighted_html_for_file`
use syntect::highlighting::{Color, ThemeSet};
use syntect::html::highlighted_html_for_file;
use syntect::parsing::SyntaxSet;

fn main() {
    let ss = SyntaxSet::load_defaults_newlines();
    let ts = ThemeSet::load_defaults();

    let args: Vec<String> = std::env::args().collect();
    if args.len() < 2 {
        println!("Please pass in a file to highlight");
        return;
    }

    let style = "
        pre {
            font-size:13px;
            font-family: Consolas, \"Liberation Mono\", Menlo, Courier, monospace;
        }";
    println!(
        "<head><title>{}</title><style>{}</style></head>",
        &args[1], style
    );
    let theme = &ts.themes["base16-ocean.dark"];
    let c = theme.settings.background.unwrap_or(Color::WHITE);
    println!(
        "<body style=\"background-color:#{:02x}{:02x}{:02x};\">\n",
        c.r, c.g, c.b
    );
    let html = highlighted_html_for_file(&args[1], &ss, theme).unwrap();
    println!("{}", html);
    println!("</body>");
}

阅读代码发现,大部分的代码是加载一些预设,然后输出模板,最终将代码文件转成 HTML 文本的代码是 highlighted_html_for_file 方法,经查看,该方法的源码如下:

rust
pub fn highlighted_html_for_file<P: AsRef<Path>>(
    path: P,
    ss: &SyntaxSet,
    theme: &Theme,
) -> Result<String, Error> {
    let mut highlighter = HighlightFile::new(path, ss, theme)?;
    let (mut output, bg) = start_highlighted_html_snippet(theme);

    let mut line = String::new();
    while highlighter.reader.read_line(&mut line)? > 0 {
        {
            let regions = highlighter.highlight_lines.highlight_line(&line, ss)?;
            append_highlighted_html_for_styled_line(
                &regions[..],
                IncludeBackground::IfDifferent(bg),
                &mut output,
            )?;
        }
        line.clear();
    }
    output.push_str("</pre>\n");
    Ok(output)
}

有同类型的方法 highlighted_html_for_string 源码如下:

rust
pub fn highlighted_html_for_string(
    s: &str,
    ss: &SyntaxSet,
    syntax: &SyntaxReference,
    theme: &Theme,
) -> Result<String, Error> {
    let mut highlighter = HighlightLines::new(syntax, theme);
    let (mut output, bg) = start_highlighted_html_snippet(theme);

    for line in LinesWithEndings::from(s) {
        let regions = highlighter.highlight_line(line, ss)?;
        append_highlighted_html_for_styled_line(
            &regions[..],
            IncludeBackground::IfDifferent(bg),
            &mut output,
        )?;
    }
    output.push_str("</pre>\n");
    Ok(output)
}

对比可以发现,生成 HTML 文本主要的两个方法有:highlighter.highlight_lineappend_highlighted_html_for_styled_line。那么我们就可以使用这两个方法自己来实现最终生成的 HTML 文本:

rust
pub mod code_highlight{
  use syntect::easy::HighlightLines;
  use syntect::Error;
  use syntect::highlighting::ThemeSet;
  use syntect::html::{append_highlighted_html_for_styled_line, IncludeBackground};
  use syntect::parsing::SyntaxSet;
  use syntect::util::LinesWithEndings;

  pub fn highlight(code: &str, lang: &str) -> Result<String, Error>{
    let ps = SyntaxSet::load_defaults_newlines();
    let ts = ThemeSet::load_defaults();
    let theme = &ts.themes["InspiredGitHub"];
    let syntax = ps.find_syntax_by_extension(lang).unwrap();
    let mut highlighter = HighlightLines::new(syntax, theme);
    let mut output = String::from("<pre>");
    for line in LinesWithEndings::from(code) {
      let regions = highlighter.highlight_line(line, &ps)?;
      append_highlighted_html_for_styled_line(
        &regions[..],
        IncludeBackground::No, // 不在代码标签中添加 `style:"background-color: xxxx"`
        &mut output,
      )?;
    }
    output.push_str("</pre>\n");
    Ok(output)
  }
}

#[cfg(test)]
mod test{
  use syntect::highlighting::ThemeSet;
  use super::code_highlight;
  #[test]
  fn test_my_highlight(){
    let result = code_highlight::highlight("let a = 1 + 2;\nconsole.log(a)", "js")
      .unwrap_or_else(|_| String::from("error"));
    print!("{}", result)
  }

  #[test]
  fn traversal_theme(){
    let ts = ThemeSet::load_defaults();
    for (name, _the) in ts.themes.iter(){
      println!("{}", name)
    }
  }
}

运行 test_my_highlight 方法的结果为:

html
<pre><span style="font-weight:bold;color:#a71d5d;">let </span><span style="color:#323232;">a </span><span style="font-weight:bold;color:#a71d5d;">= </span><span style="color:#0086b3;">1 </span><span style="font-weight:bold;color:#a71d5d;">+ </span><span style="color:#0086b3;">2</span><span style="color:#323232;">;
</span><span style="color:#795da3;">console</span><span style="color:#323232;">.</span><span style="color:#0086b3;">log</span><span style="color:#323232;">(a)</span></pre>

这里的代码有多余的换行符,可以自行处理

观察运行结果,不难发现可以通过修改 output 变量来实现想要的效果。如果想要在代码行外添加 p 标签,只需要在循环中加入文本即可:

rust
for line in LinesWithEndings::from(code) {
  let regions = highlighter.highlight_line(line, &ps)?;
  output.push_str("<p>");
  append_highlighted_html_for_styled_line(
    &regions[..],
    IncludeBackground::No,
    &mut output,
  )?;
  output.push_str("</p>")
}

对于代码高亮的主题, syntect 中的默认主题包中包含如下主题:

text
InspiredGitHub
Solarized (dark)
Solarized (light)
base16-eighties.dark
base16-mocha.dark
base16-ocean.dark
base16-ocean.light