#:g1: Introducing cl-unicode

Posted 2014-12-27 15:00:00 GMT

(An entry for LISP Library 365)

This is day 362 of LISP Library 365.

What is cl-unicode?

cl-unicode is a library by Edi Weitz for working with Unicode in Common Lisp.

Package information

Package name: cl-unicode
Quicklisp: available
Documentation: CL-UNICODE - A portable Unicode library for Common Lisp
CLiki: CLiki: cl-unicode
Quickdocs: cl-unicode | Quickdocs
CL Test Grid (build status): cl-unicode | CL Test Grid

Installation

(ql:quickload :cl-unicode)

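Trying it out

You can check which functions are available on Quickdocs.

As the name suggests, this is a library for working with Unicode in Common Lisp.
It covers the usual operations, such as retrieving the Unicode properties of a character.
In addition, even among implementations that support Unicode, details such as character names are not standardized; with cl-unicode you can write them in one uniform way.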

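(defun props (char)
  ;; Collect every property name that cl-unicode recognizes and that CHAR has.
  ;; A plain lambda is used here so no extra utility package is required.
  (remove-if-not (lambda (property) (cl-unicode:has-property char property))
                 (cl-unicode:recognized-properties)))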

(props #\あ)
;=> ("Alphabetic" "Any" "Assigned" "BidiClass:L" "Block:Hiragana" "GraphemeBase"
;    "Hiragana" "IDContinue" "IDStart" "L" "Lo" "XIDContinue" "XIDStart")
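Individual properties can also be looked up directly instead of testing every recognized property. The following is a minimal sketch, assuming Quicklisp is installed and the source file is read as UTF-8; `describe-char` is a hypothetical helper name introduced only for illustration, while `unicode-name`, `general-category`, `script`, and `code-block` are functions exported by cl-unicode.

```lisp
;; Assumption: Quicklisp is available in the running image.
(ql:quickload :cl-unicode)

(defun describe-char (char)
  "Return a plist with a few Unicode properties of CHAR.
Hypothetical helper, not part of cl-unicode."
  ;; Some of these functions return a second value (a symbol);
  ;; inside LIST only the primary value (a string) is kept.
  (list :name (cl-unicode:unicode-name char)
        :general-category (cl-unicode:general-category char)
        :script (cl-unicode:script char)
        :block (cl-unicode:code-block char)))

(describe-char #\あ)
```

The general category and script reported here should match the "Lo" and "Hiragana" entries in the property list shown above.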

(mapcar #'string (cl-unicode:list-all-characters "Hiragana"))
;=> ("ぁ" "あ" "ぃ" "い" "ぅ" "う" "ぇ" "え" "ぉ" "お" "か" "が" "き" "ぎ" "く" "ぐ" "け" "げ" "こ"
;    "ご" "さ" "ざ" "し" "じ" "す" "ず" "せ" "ぜ" "そ" "ぞ" "た" "だ" "ち" "ぢ" "っ" "つ" "づ" "て"
;    "で" "と" "ど" "な" "に" "ぬ" "ね" "の" "は" "ば" "ぱ" "ひ" "び" "ぴ" "ふ" "ぶ" "ぷ" "へ" "べ"
;    "ぺ" "ほ" "ぼ" "ぽ" "ま" "み" "む" "め" "も" "ゃ" "や" "ゅ" "ゆ" "ょ" "よ" "ら" "り" "る" "れ"
;    "ろ" "ゎ" "わ" "ゐ" "ゑ" "を" "ん" "ゔ" "ゕ" "ゖ" "ゝ" "ゞ" "ゟ")

(mapcar #'string (cl-unicode:list-all-characters "Katakana"))
;=> ("ァ" "ア" "ィ" "イ" "ゥ" "ウ" "ェ" "エ" "ォ" "オ" "カ" "ガ" "キ" "ギ" "ク" "グ" "ケ" "ゲ" "コ"
;    "ゴ" "サ" "ザ" "シ" "ジ" "ス" "ズ" "セ" "ゼ" "ソ" "ゾ" "タ" "ダ" "チ" "ヂ" "ッ" "ツ" "ヅ" "テ"
;    "デ" "ト" "ド" "ナ" "ニ" "ヌ" "ネ" "ノ" "ハ" "バ" "パ" "ヒ" "ビ" "ピ" "フ" "ブ" "プ" "ヘ" "ベ"
;    "ペ" "ホ" "ボ" "ポ" "マ" "ミ" "ム" "メ" "モ" "ャ" "ヤ" "ュ" "ユ" "ョ" "ヨ" "ラ" "リ" "ル" "レ"
;    "ロ" "ヮ" "ワ" "ヰ" "ヱ" "ヲ" "ン" "ヴ" "ヵ" "ヶ" "ヷ" "ヸ" "ヹ" "ヺ" "ヽ" "ヾ" "ヿ" "ㇰ" "ㇱ"
;    "ㇲ" "ㇳ" "ㇴ" "ㇵ" "ㇶ" "ㇷ" "ㇸ" "ㇹ" "ㇺ" "ㇻ" "ㇼ" "ㇽ" "ㇾ" "ㇿ" "㋐" "㋑" "㋒" "㋓" "㋔"
;    "㋕" "㋖" "㋗" "㋘" "㋙" "㋚" "㋛" "㋜" "㋝" "㋞" "㋟" "㋠" "㋡" "㋢" "㋣" "㋤" "㋥" "㋦" "㋧"
;    "㋨" "㋩" "㋪" "㋫" "㋬" "㋭" "㋮" "㋯" "㋰" "㋱" "㋲" "㋳" "㋴" "㋵" "㋶" "㋷" "㋸" "㋹" "㋺"
;    "㋻" "㋼" "㋽" "㋾" "㌀" "㌁" "㌂" "㌃" "㌄" "㌅" "㌆" "㌇" "㌈" "㌉" "㌊" "㌋" "㌌" "㌍" "㌎"
;    "㌏" "㌐" "㌑" "㌒" "㌓" "㌔" "㌕" "㌖" "㌗" "㌘" "㌙" "㌚" "㌛" "㌜" "㌝" "㌞" "㌟" "㌠" "㌡"
;    "㌢" "㌣" "㌤" "㌥" "㌦" "㌧" "㌨" "㌩" "㌪" "㌫" "㌬" "㌭" "㌮" "㌯" "㌰" "㌱" "㌲" "㌳" "㌴"
;    "㌵" "㌶" "㌷" "㌸" "㌹" "㌺" "㌻" "㌼" "㌽" "㌾" "㌿" "㍀" "㍁" "㍂" "㍃" "㍄" "㍅" "㍆" "㍇"
;    "㍈" "㍉" "㍊" "㍋" "㍌" "㍍" "㍎" "㍏" "㍐" "㍑" "㍒" "㍓" "㍔" "㍕" "㍖" "㍗" "ヲ" "ァ" "ィ"
;    "ゥ" "ェ" "ォ" "ャ" "ュ" "ョ" "ッ" "ア" "イ" "ウ" "エ" "オ" "カ" "キ" "ク" "ケ" "コ" "サ" "シ"
;    "ス" "セ" "ソ" "タ" "チ" "ツ" "テ" "ト" "ナ" "ニ" "ヌ" "ネ" "ノ" "ハ" "ヒ" "フ" "ヘ" "ホ" "マ"
;    "ミ" "ム" "メ" "モ" "ヤ" "ユ" "ヨ" "ラ" "リ" "ル" "レ" "ロ" "ワ" "ン")
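The same script names can be used the other way around, to test the characters of a string one by one. This is a minimal sketch, assuming Quicklisp is installed and the source file is read as UTF-8; `keep-script` is a hypothetical helper name, and only `cl-unicode:has-property` comes from the library.

```lisp
;; Assumption: Quicklisp is available in the running image.
(ql:quickload :cl-unicode)

(defun keep-script (script string)
  "Return a string holding only the characters of STRING that have
the property named SCRIPT. Hypothetical helper, not part of cl-unicode."
  (remove-if-not (lambda (char) (cl-unicode:has-property char script))
                 string))

(keep-script "Hiragana" "ひらがなとカタカナ") ;=> "ひらがなと"
(keep-script "Katakana" "ひらがなとカタカナ") ;=> "カタカナ"
```

Testing each character with `has-property` avoids building the full character list on every call, which matters for large properties.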

(print #\㌍)
;>>
;>> #\SQUARE_KARORII
;=> #\SQUARE_KARORII

(cl-unicode:unicode1-name #\㌍)
;=> "SQUARED KARORII"

(cl-unicode:character-named "SQUARED KARORII")
;=> #\SQUARE_KARORII
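What the printer or `char-name` shows for a character depends on the implementation, whereas `cl-unicode:unicode-name` returns the name from the Unicode data shipped with the library. The sketch below assumes Quicklisp is installed and the source file is read as UTF-8; `portable-name` and `name-round-trip-p` are hypothetical helper names introduced for illustration.

```lisp
;; Assumption: Quicklisp is available in the running image.
(ql:quickload :cl-unicode)

(defun portable-name (char)
  "Return the current Unicode name of CHAR, falling back to its
Unicode 1.0 name, or NIL if neither exists. Hypothetical helper."
  (or (cl-unicode:unicode-name char)
      (cl-unicode:unicode1-name char)))

(defun name-round-trip-p (char)
  "True if looking CHAR up by its portable name yields CHAR again.
Hypothetical helper."
  (let ((name (portable-name char)))
    (and name
         (eql char (cl-unicode:character-named name)))))

(portable-name #\㌍)      ;=> "SQUARE KARORII"
(name-round-trip-p #\㌍)
```

Because the names come from cl-unicode rather than from the implementation, code written this way should not depend on spellings such as `SQUARE_KARORII` that a particular Lisp happens to print.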

Summary

This time I introduced cl-unicode.
It is also handy for searching by name and discovering interesting characters.
