A talk about `crystal tool format`

Twitter: @make_now_just
GitHub: @MakeNowJust
Fond of regular expressions

No.4 Crystal Contributor

What is Crystal?

A programming language that combines Ruby-like syntax with C-like speed.

It compiles to native code via LLVM.

a = 1

while true
  case a
  when Int32  then a = "foo"
  when String then break
  end
end

typeof(a) # => String

😎 https://crystal-lang.org/

Various `crystal` commands

crystal run foo.cr: compile, then run
crystal build foo.cr: compile only

Various `crystal` commands (cont.)

crystal docs: documentation generator
crystal play: playground
crystal tool format: formatter

This talk is about crystal tool format

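`crystal tool format`

Crystal's official formatter
The compiler's own source code, among other things, is formatted with it

`crystal tool format` (cont.)

Fixes indentation, trailing commas, and the like
Respects the original code as far as possible, e.g. (before, then after):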

[{foo: "foo"}, {foo: "bar"},
 {foo: "baz"}]

[
  {
    foo: "foo",
  },
  {foo: "bar"},
  {
    foo: "baz",
  },
]

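How a typical formatter is implemented

Parse the source code (attaching position information along the way)
Walk the abstract syntax tree and generate the formatted source code

In a language like Crystal, where the same thing can be written in several ways, attaching all of that information to the syntax tree is hard (e.g. whether a method call has parentheses, or which delimiter a string literal uses).

Even setting that aside, respecting the original code as described earlier means checking position information again and again, which gets messy.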
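To make the limitation concrete, here is a minimal sketch of the conventional approach. The `Node`, `NumberNode`, `CallNode`, and `print_node` names are hypothetical and are not the Crystal compiler's classes. The tree does not record whether the call was written with parentheses, so the printer has to pick one spelling.

```crystal
# Hypothetical mini-AST; these are NOT the Crystal compiler's node classes.
abstract class Node
end

class NumberNode < Node
  getter value : Int32

  def initialize(@value)
  end
end

class CallNode < Node
  getter name : String
  getter args : Array(Node)

  def initialize(@name, @args)
  end
end

# Regenerate source text purely from the tree.
def print_node(node : Node) : String
  case node
  when NumberNode
    node.value.to_s
  when CallNode
    args = node.args.map { |arg| print_node(arg) }.join(", ")
    # The tree has no "had parentheses" flag, so one style is always emitted.
    "#{node.name}(#{args})"
  else
    raise "unknown node"
  end
end

# In this sketch, `puts 1, 2` and `puts(1, 2)` would both become this tree.
tree = CallNode.new("puts", [NumberNode.new(1), NumberNode.new(2)] of Node)
puts print_node(tree)
```

Preserving the author's spelling would need extra fields on every node (or repeated position lookups into the source), which is the cost the slide describes.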

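How `crystal tool format` does it

Parse the source code
Create a lexer over the same source a second time
Walk the abstract syntax tree while advancing the lexer in step, generating the formatted source code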
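The three steps above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the real formatter: `Call`, `MiniFormatter`, and its regex-based tokenizer are hypothetical, and the only method name borrowed from the compiler is `write_token`, whose behavior here is simplified.

```crystal
# Hypothetical AST for a call such as `foo 1, 2` or `foo(1, 2)`.
record Call, name : String, args : Array(Int32)

class MiniFormatter
  @tokens : Array(String)
  @pos = 0

  def initialize(source : String)
    # "Create the lexer a second time": re-tokenize the same source text.
    @tokens = source.scan(/\w+|[(),]/).map { |m| m[0] }
    @output = IO::Memory.new
  end

  # Current token, or nil at end of input.
  private def token : String?
    @tokens[@pos]?
  end

  # Emit the token the AST expects and advance the token stream.
  private def write_token(expected : String)
    raise "out of sync: expected #{expected}, got #{token.inspect}" unless token == expected
    @output << expected
    @pos += 1
  end

  def format(call : Call) : String
    write_token call.name
    # The AST does not know about parentheses; the token stream does.
    parens = token == "("
    if parens
      write_token "("
    else
      @output << " "
    end
    call.args.each_with_index do |arg, i|
      write_token arg.to_s
      if i < call.args.size - 1
        write_token ","
        @output << " " # normalize spacing after commas
      end
    end
    write_token ")" if parens
    @output.to_s
  end
end

puts MiniFormatter.new("foo 1,2").format(Call.new("foo", [1, 2]))
puts MiniFormatter.new("foo( 1 ,  2 )").format(Call.new("foo", [1, 2]))
```

The same tree yields `foo 1, 2` for the first input and `foo(1, 2)` for the second: spacing is normalized, while the choice of parentheses comes from the tokens rather than from the tree.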

Incidentally, the implementation lives in the Crystal repository at

src/compiler/crystal/tools/formatter.cr

A whopping 4861 lines in a single file.

e.g. formatting an array literal

def format_literal_elements(elements, prefix, suffix)
  slash_is_regex!
  write_token prefix
  has_newlines = false
  wrote_newline = false
  write_space_at_end = false
  next_needs_indent = false
  found_comment = false
  found_first_newline = false
  found_comment = skip_space
  if found_comment || @token.type == :NEWLINE
    # add one level of indentation for contents if a newline is present
    offset = @indent + 2
    if elements.empty?
      skip_space_or_newline
      write_token suffix
      return false
    end
    indent(offset) { consume_newlines }
    skip_space_or_newline
    wrote_newline = true
    next_needs_indent = true
    has_newlines = true
    found_first_newline = true
  else
    # indent contents at the same column as starting token if no newline
    offset = @column
  end
  elements.each_with_index do |element, i|
    current_element = element
    if current_element.is_a?(HashLiteral::Entry)
      current_element = current_element.key
    end
    # This is to prevent writing `{{` and `{%`
    if prefix == :"{" && i == 0 && !wrote_newline &&
       (@token.type == :"{" || @token.type == :"{{" || @token.type == :"{%" ||
       @token.type == :"%" || @token.raw.starts_with?("%"))
      write " "
      write_space_at_end = true
    end
    if next_needs_indent
      write_indent(offset, element)
    else
      indent(offset, element)
    end
    has_heredoc_in_line = !@lexer.heredocs.empty?
    last = last?(i, elements)
    found_comment = skip_space(offset, write_comma: (last || has_heredoc_in_line) && has_newlines)
    if @token.type == :","
      if !found_comment && (!last || has_heredoc_in_line)
        write ","
        wrote_comma = true
      end
      slash_is_regex!
      next_token
      found_comment = skip_space(offset, write_comma: last && has_newlines)
      if @token.type == :NEWLINE
        if last && !found_comment && !wrote_comma
          write ","
          found_comment = true
        end
        indent(offset) { consume_newlines }
        skip_space_or_newline
        next_needs_indent = true
        has_newlines = true
      else
        if !last && !found_comment
          write " "
          next_needs_indent = false
        elsif found_comment
          next_needs_indent = true
        end
      end
    end
  end
  finish_list suffix, has_newlines, found_comment, found_first_newline, write_space_at_end
end


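Problems with this approach

Because the lexer is advanced in step with the source code, the formatter is effectively a reimplementation of the parser
e.g. forget a single next_token call on the lexer and the output drifts badly, leaving you unsure what went wrong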
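A small sketch of that failure mode, assuming a hypothetical `walk` helper that pairs each AST element of `[1, 2, 3]` with the token the walker is currently looking at (none of this is the compiler's code). Skipping one "advance past the comma" step is enough to misalign every later element.

```crystal
# Hypothetical walker; pairs each AST element with the token at the cursor.
def walk(tokens : Array(String), elements : Array(Int32), skip_comma : Bool) : Array(String)
  pos = 1 # step past the opening "["
  elements.map do |element|
    seen = tokens[pos]? || "<eof>"
    pos += 1               # consume the element token
    pos += 1 if skip_comma # consume the "," (or "]") that follows it
    "AST #{element} / token #{seen}"
  end
end

tokens = ["[", "1", ",", "2", ",", "3", "]"]
elements = [1, 2, 3] # what the AST says the array contains

# Correct walker: every element lines up with its own token.
walk(tokens, elements, skip_comma: true).each { |line| puts line }

# Walker that forgets one advance per element: from the second element on,
# the AST node and the token under the cursor no longer match.
walk(tokens, elements, skip_comma: false).each { |line| puts line }
```

In the second run the element 2 is paired with the "," token and the element 3 with the "2" token, which is the kind of drift that makes such bugs hard to trace.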

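Problems with this approach (cont.)

So it is very easy to introduce bugs
Very easy to introduce bugs

It matters, so it is worth saying twice.

I meant to write about how other formatters are implemented, but ran out of steam.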
