This document was copied from a Feishu document so that it can be searched, and some formatting may not have carried over. Referring to the original Feishu document is recommended.
Background#
Recently I have been following Robinson, written by mbrubeck, to learn how to build a simple browser engine in Rust (I will write it up in a separate document once I finish). Along the way I need to parse formatted files such as HTML and CSS, which means writing the corresponding parsers.
Hand-writing a parser from scratch is tedious and error-prone: besides the rules of the protocol or format being parsed, you also have to think about error handling, extensibility, and parsing performance. For that reason, the Robinson article itself recommends replacing the hand-written parsers with an existing third-party library such as pest.
Looking back at my daily development work, I rarely need to build a parser. When I need to parse a format or protocol such as JSON or CSV, I simply use a third-party library dedicated to it. In practice, though, not every protocol or format has a ready-made parser, especially the many network communication protocols, and existing parsers are often hard to customize. So I am taking this opportunity to learn about parsers and the practice of building them, which should help in similar situations later.
Notes:
- This document does not dig into the internals of parsers or parser libraries. It is mainly an introduction to understanding and using parsers.
- This series is split into two parts. Part one covers parser concepts and the use of third-party libraries; part two covers hands-on practice.
- The source code for this series is available at https://github.com/catwithtudou/parser_toy.
Prerequisites#
This section introduces some background on parsers to make the later material easier to follow.
Parser#
"Parser" is used here in a broad sense: a component that converts information in a particular format into a particular data structure.
You can think of it as "decoding" information in some format and abstracting it into organized data structures that are easier to understand and process.
For example, suppose we have the text of an arithmetic expression, "1 + 2", and we want a program to compute its result.
For the program to recognize the expression, an arithmetic-expression parser first converts the text into a (left, op, right) structure, which is then evaluated.
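The "1 + 2" example can be sketched with the standard library alone. This is only an illustration under simple assumptions (integers, one operator, whitespace between tokens); `BinaryExpr`, `parse_expr`, and `eval` are hypothetical names, not part of pest or nom.

```rust
// Hypothetical (left, op, right) structure for illustration only.
#[derive(Debug, PartialEq)]
struct BinaryExpr {
    left: i64,
    op: char,
    right: i64,
}

// Parse text of the form "<int> <op> <int>", tokens separated by whitespace.
fn parse_expr(input: &str) -> Option<BinaryExpr> {
    let mut parts = input.split_whitespace();
    let left: i64 = parts.next()?.parse().ok()?;
    let op_str = parts.next()?;
    let right: i64 = parts.next()?.parse().ok()?;
    // Reject trailing tokens and multi-character operators.
    if parts.next().is_some() || op_str.chars().count() != 1 {
        return None;
    }
    let op = op_str.chars().next()?;
    Some(BinaryExpr { left, op, right })
}

// Evaluate the structured form; unknown operators and overflow yield None.
fn eval(expr: &BinaryExpr) -> Option<i64> {
    match expr.op {
        '+' => expr.left.checked_add(expr.right),
        '-' => expr.left.checked_sub(expr.right),
        '*' => expr.left.checked_mul(expr.right),
        '/' => expr.left.checked_div(expr.right),
        _ => None,
    }
}
```

Calling `parse_expr("1 + 2")` and passing the result to `eval` separates the two concerns: the parser only produces structure, and the evaluator only consumes it.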
Parsers are indispensable in data processing across computing, and show up in many scenarios. Common examples:
- The parser in a compiler or interpreter, at the lower level, performs syntax analysis on source code and extracts an abstract syntax tree (AST).
- Text in JSON, the data-interchange format widely used in web applications, is deserialized by a JSON parser into the data structures the program needs.
- Parsers are also used for network communication protocols, scripting languages, database languages, and so on.
PEG#
Before introducing PEG (Parsing Expression Grammar), it helps to start from something more familiar: regular expressions.
Regular expressions and PEG both use a specific syntax to match and parse text, but they differ in the following ways:
- [Syntax] Regular expressions use a dedicated syntax to describe string patterns and are typically used for simple string matching and searching. PEG uses a richer syntax to describe language structure and is typically used for more complex language parsing and analysis.
- [Application] Regular expressions mainly serve simple text processing, such as searching for text that matches a pattern or validating input formats. PEG handles complex language structures, such as parsing a programming language or building an interpreter.
This should give you a rough idea of what PEG is.
The reason for introducing PEG is that tools built on it (called parser generators) can be used to implement custom parsers.
A brief, more formal introduction to PEG follows:
- PEG overview
Parsing Expression Grammar, abbreviated PEG:
- It is an analytic formal grammar. It was introduced by Bryan Ford in 2004 and is closely related to the family of top-down parsing languages introduced in the 1970s.
- As a syntax for describing language structure, it can handle more complex structures than regular expressions, and because rules can be recursive it can describe arbitrarily deep nesting.
- Grammar rules are defined in a simple, flexible way and are then used to parse an input string and produce a syntax tree.
- It is easy to use, precise, and performs well, and PEG-based tools typically offer features such as error reporting and reusable rule templates, so it is widely used for parsing and analyzing text.
- Using PEG
PEG syntax resembles a programming language: it describes language structure with operators and rules.
- Operators include "|" (ordered choice, written "/" in Ford's original notation), "&" (and-predicate, a lookahead that consumes no input), and "?" (optional). Rules describe the concrete structure of the language.
- For example, the following simple PEG rule describes the syntax of an integer:
int := [0-9]+
Because a PEG can be translated directly into efficient parser code, many parser tools are built on it, for example PEG.js. (ANTLR is often mentioned alongside these tools, but it is a parser generator based on LL(*)-style parsing rather than PEG.)
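To make "translated directly into parser code" concrete, here is a hand-written sketch of what a recognizer for the rule `int := [0-9]+` might look like. It is an assumption-laden illustration using only the standard library, not the output of any real parser generator; the function name `int` simply mirrors the rule name.

```rust
// Hand-written equivalent of the PEG rule `int := [0-9]+`.
// Returns (matched, rest) on success, or None if the input does not
// start with at least one ASCII digit.
fn int(input: &str) -> Option<(&str, &str)> {
    // Byte index of the first non-digit; digits are ASCII, so this index
    // is always a valid character boundary.
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some(input.split_at(end))
    }
}
```

The `+` in the rule shows up as the "at least one" check (`end == 0` fails), and the greedy behaviour shows up as scanning to the first non-digit.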
Parser Combinator#
With the above understanding of parsers, parser combinators are fairly easy to grasp.
- Definition and idea
Put simply, a parser combinator builds a parser by combining other parser components.
The idea fits software engineering well: it is a technique for constructing parsers through function composition. Small, reusable, testable parser components are combined into complex parsers, which makes parsers more flexible and extensible to build, speeds up development, and simplifies later maintenance.
- Parser combinators and parser generators
Parser combinators are a parallel concept to the parser generators mentioned earlier. An analogy:
- Think of the parser we want to implement (say, a JSON parser) as a large building.
- With a parser generator, we build the building almost from scratch every time; the parts that different buildings have in common (doors, windows, and so on) cannot be reused.
- With parser combinators, we build small, reusable, testable components, like Lego bricks, and then assemble the building from them. A new building can reuse components made earlier, which is very convenient. When a parser misbehaves, the problem is also easy to trace to a specific component, which helps maintenance.
- Parser combinators and PEG-based parser generators
[Expressiveness] Parser combinators are more flexible, because parsers are combined and defined directly with the features of the host programming language. A PEG-based parser generator describes the parser with a dedicated grammar syntax, so its expressiveness is limited by that syntax, and you have to learn both the generator's own interface and the PEG grammar rules.
[Performance] How the two compare depends on the specific implementation and use case. Generally speaking, parser generators tend to emit efficient parser code and so may perform better on large grammars and complex input, while parser combinators usually carry some overhead because parsers are composed dynamically at run time.
In Rust today, however, nom (built on parser combinators) performs better than pest (built on PEG).
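The "Lego brick" idea in this section can be sketched with plain Rust closures. This is a minimal illustration, assuming a parser is just a function from input to `(rest, output)`; the names `ParseResult`, `literal`, `pair`, `either`, `boolean`, and `bool_pair` are hypothetical and are not nom's API.

```rust
// A parser takes the input and returns (rest, output), or None on failure.
type ParseResult<'a, T> = Option<(&'a str, T)>;

// Building block: match a literal prefix.
fn literal<'a>(expected: &'static str) -> impl Fn(&'a str) -> ParseResult<'a, &'a str> {
    move |input: &'a str| {
        let rest = input.strip_prefix(expected)?;
        Some((rest, &input[..expected.len()]))
    }
}

// Combinator: run two parsers in sequence and pair up their outputs.
fn pair<'a, A, B>(
    first: impl Fn(&'a str) -> ParseResult<'a, A>,
    second: impl Fn(&'a str) -> ParseResult<'a, B>,
) -> impl Fn(&'a str) -> ParseResult<'a, (A, B)> {
    move |input| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Some((rest, (a, b)))
    }
}

// Combinator: try the first parser and fall back to the second.
fn either<'a, T>(
    first: impl Fn(&'a str) -> ParseResult<'a, T>,
    second: impl Fn(&'a str) -> ParseResult<'a, T>,
) -> impl Fn(&'a str) -> ParseResult<'a, T> {
    move |input| first(input).or_else(|| second(input))
}

// A reusable component built from the bricks above.
fn boolean(input: &str) -> ParseResult<'_, &str> {
    either(literal("true"), literal("false"))(input)
}

// A larger parser that reuses `boolean` twice.
fn bool_pair(input: &str) -> ParseResult<'_, (&str, &str)> {
    pair(boolean, boolean)(input)
}
```

Each piece can be tested on its own, and `bool_pair` is assembled without writing any new matching logic, which is the property the building analogy describes.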
Rust parser libraries#
This section introduces two classic third-party libraries for implementing parsers in Rust: pest, which is based on PEG, and nom, a parser combinator library.
pest#
Overview#
pest is a general-purpose parser written in Rust with a focus on accessibility, correctness, and performance. It takes a PEG, as described above, as input. This gives it the extra expressive power needed to parse complex languages while letting you define and generate a custom parser concisely and elegantly.
It also provides automatically generated error reports, automatic implementation of the parser trait through a derive attribute, and the ability to define multiple parsers in a single file.
Usage example#
- Add the pest dependencies to Cargo.toml
[dependencies]
pest = "2.6"
pest_derive = "2.6"
- Create a new src/grammar.pest file and write the parsing expression grammar
The grammar below defines a rule named field: every character is an ASCII digit, a decimal point, or a minus sign, and + means the pattern occurs one or more times.
field = { (ASCII_DIGIT | "." | "-")+ }
- Create a new src/parser.rs file and define the parser
The code below defines a struct Parser. The derive macro automatically implements, on every compilation, a parser that follows the patterns in the grammar file.
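use pest_derive::Parser;

#[derive(Parser)]
#[grammar = "grammar.pest"] // path is relative to src/
pub struct Parser;
// Each time this file is compiled, pest uses the grammar file to generate items such as the Rule enum.

#[cfg(test)]
mod test {
    // Bring the pest::Parser trait into scope without clashing with the struct name.
    use pest::Parser as _;

    use super::{Parser, Rule};

    #[test]
    pub fn test_parse() {
        // "-273.15" consists only of digits, "." and "-", so it matches.
        let successful_parse = Parser::parse(Rule::field, "-273.15");
        println!("{:?}", successful_parse);
        // "China" contains none of the allowed characters, so parsing fails.
        let unsuccessful_parse = Parser::parse(Rule::field, "China");
        println!("{:?}", unsuccessful_parse);
    }
}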
Detailed usage#
Parser API#
pest provides several ways to access the result of a successful parse. They are introduced below using this example grammar:
number = { ASCII_DIGIT+ } // one or more decimal digits
enclosed = { "(.." ~ number ~ "..)" } // for instance, "(..1024..)"
sum = { number ~ " + " ~ number } // for instance, "1024 + 12"
- Tokens
pest represents a successful parse with tokens. Every time a rule matches, two tokens are produced: one at the start of the match and one at the end. For example:
"3130 abc"
 |   ^ end(number)
 ^ start(number)
RustRover currently has a plugin that supports the pest format, which can validate rules and display tokens.
- Nested rules
If a named rule contains another named rule, tokens are produced for both, as shown below:
"(..6472..)"
 |  |   |  ^ end(enclosed)
 |  |   ^ end(number)
 |  ^ start(number)
 ^ start(enclosed)
In some cases, tokens do not fall at distinct character positions:
"1773 + 1362"
 |   |  |   ^ end(sum)
 |   |  |   ^ end(number)
 |   |  ^ start(number)
 |   ^ end(number)
 ^ start(number)
 ^ start(sum)
- Interface
Tokens are exposed as the Token enum, which has Start and End variants. Calling tokens() on a parse result returns an iterator over them:
let parse_result = DemoParser::parse(Rule::sum, "1773 + 1362").unwrap();
let tokens = parse_result.tokens();
for token in tokens {
println!("{:?}", token);
}
- Pairs
To explore the parse tree in terms of matching pairs of tokens, pest provides the Pair type. A Pair is mainly used to:
- determine which rule produced the Pair
- use the Pair as a raw &str
- inspect the inner named rules that produced the Pair
let pair = DemoParser::parse(Rule::enclosed, "(..6472..) and more text")
.unwrap().next().unwrap();
assert_eq!(pair.as_rule(), Rule::enclosed);
assert_eq!(pair.as_str(), "(..6472..)");
let inner_rules = pair.into_inner();
println!("{}", inner_rules); // --> [number(3, 7)]
A Pair can have any number of inner rules. Pair::into_inner() returns a Pairs, which is an iterator over each inner pair:
let pairs = DemoParser::parse(Rule::sum, "1773 + 1362")
.unwrap().next().unwrap()
.into_inner();
let numbers = pairs
.clone()
.map(|pair| str::parse(pair.as_str()).unwrap())
.collect::<Vec<i32>>();
assert_eq!(vec![1773, 1362], numbers);
for (found, expected) in pairs.zip(vec!["1773", "1362"]) {
assert_eq!(Rule::number, found.as_rule());
assert_eq!(expected, found.as_str());
}
- The parse method
The derived Parser provides a parse method that returns Result<Pairs, Error>. To access the underlying parse tree, match on the result or unwrap it:
// check whether parsing succeeded
match Parser::parse(Rule::enclosed, "(..6472..)") {
Ok(mut pairs) => {
let enclosed = pairs.next().unwrap();
// ...
}
Err(error) => {
// ...
}
}
Parsing expression syntax#
The basic logic of a PEG is simple and direct, and can be summed up in three steps:
- Try to match a rule
- If it succeeds, try the next step
- If it fails, try another rule
The grammar has the following four characteristics:
- Eagerness
When a repeated PEG expression is run on an input string, it runs greedily (as many times as possible), with the following outcome:
- If the match succeeds, it consumes what it matched and passes the remaining input to the next step of the parser.
- If the match fails, it consumes nothing, and the failure propagates, eventually failing the whole parse unless it is caught along the way.
// expression
ASCII_DIGIT+ // one or more characters from '0' to '9'
// matching process
"42 boxes"
 ^ Running ASCII_DIGIT+
"42 boxes"
   ^ Successfully took one or more digits!
" boxes"
 ^ Remaining unparsed input.
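- Ordered choice
The grammar has an ordered choice operator |. For example, one|two first tries one and, only if that fails, tries two.
Because order matters, take care where each alternative is placed in the expression. For example:
- With the expression "a"|"ab", matching the string "abc" succeeds on the first alternative "a", so "ab" is never tried and "bc" is left unparsed.
For this reason, when writing a parser with choices, it is common to put the longest or most specific alternative first and the shortest or most general one last.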
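The effect of alternative order can be reproduced with a small std-only sketch. `ordered_choice` is a hypothetical helper that models the rule "first matching alternative wins" for string literals; it is not pest code.

```rust
// Ordered choice over string literals: return the first alternative that
// matches the start of the input as (matched, remaining).
fn ordered_choice<'a>(alternatives: &[&str], input: &'a str) -> Option<(&'a str, &'a str)> {
    alternatives
        .iter()
        // Alternatives are tried strictly in the order given.
        .find(|alt| input.starts_with(**alt))
        .map(|alt| input.split_at(alt.len()))
}
```

With the alternatives `["a", "ab"]` the input "abc" leaves "bc" behind, while `["ab", "a"]` leaves only "c", which is why the longer alternative is usually listed first.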
- Non-backtracking
During parsing, an expression either succeeds or fails.
If it succeeds, parsing moves on to the next step. If it fails, the expression fails and the engine does not back up and try again. This differs from regular expressions, which backtrack.
Consider the following example (~ means the next step runs after the preceding rule has matched):
word = { // to recognize a word...
ANY* // take any character, zero or more times...
~ ANY // followed by any character
}
"frumious"
When matching the string "frumious", ANY* first consumes the entire string, so the next step, ANY, has nothing left to match and the parse fails.
"frumious"
 ^ (word)
"frumious"
         ^ (ANY*) Success! Continue to ANY with remaining input "".
""
 ^ (ANY) Failure! Expected one character, but found end of string.
In this situation, a system with backtracking (such as regular expressions) would back up one character and try again.
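The failure of `ANY* ~ ANY` can be mimicked step by step in plain Rust. The functions below are hypothetical stand-ins for the two sub-expressions, written only to show that nothing is ever handed back once it has been consumed.

```rust
// `ANY*`: consume every remaining character; always succeeds.
// Returns (consumed, rest).
fn any_star(input: &str) -> (&str, &str) {
    input.split_at(input.len())
}

// `ANY`: consume exactly one character, or fail at end of input.
fn any(input: &str) -> Option<(&str, &str)> {
    let c = input.chars().next()?;
    Some(input.split_at(c.len_utf8()))
}

// `word = { ANY* ~ ANY }` without backtracking: once `ANY*` has consumed
// its input, it is never asked to give a character back.
fn word(input: &str) -> Option<&str> {
    let (_, rest) = any_star(input);
    let (_, rest) = any(rest)?;
    Some(rest)
}
```

Because `any_star` always leaves an empty remainder, `word` fails for every input, including "frumious"; a backtracking engine would instead retry `ANY*` with one character fewer.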
- Unambiguous
Each PEG rule runs on the remainder of the input string and consumes as much input as it can. Once a rule finishes, the rest of the input is passed on to the other parts of the parser. For example, the expression ASCII_DIGIT+ matches one or more digits and always matches the longest possible run of consecutive digits. There is no danger that a later rule will unexpectedly backtrack and steal some of those digits in an unintuitive, non-local way.
This contrasts with other parsing tools (such as regular expressions and CFGs), where the result of a rule often depends on code some distance away.
Parser grammar and built-in rules#
- Key syntax
pest has less syntax than regular expressions. The main constructs and their meanings are listed below; see the pest documentation for details:
Syntax | Meaning | Syntax | Meaning |
---|---|---|---|
foo = { ... } | regular rule | baz = @{ ... } | atomic |
bar = _{ ... } | silent | qux = ${ ... } | compound-atomic |
#tag = ... | tag | plugh = !{ ... } | non-atomic |
"abc" | exact string | ^"abc" | case-insensitive |
'a'..'z' | character range | ANY | any character |
foo ~ bar | sequence | baz \| qux | ordered choice |
foo* | zero or more | bar+ | one or more |
baz? | optional | qux{n} | exactly n |
qux{m, n} | between m and n (inclusive) | | |
&foo | positive predicate | !bar | negative predicate |
PUSH(baz) | match and push | | |
POP | match and pop | PEEK | match without popping |
DROP | pop without matching | PEEK_ALL | match entire stack |
- Built-in rules
Besides ANY, pest provides many built-in rules that make text parsing more convenient. A few common ones are shown here; see the pest documentation for the full list:
Built-in rule | Equivalent | Built-in rule | Equivalent |
---|---|---|---|
ASCII_DIGIT | '0'..'9' | ASCII_ALPHANUMERIC | any digit or letter: `ASCII_DIGIT \| ASCII_ALPHA` |
UPPERCASE_LETTER | uppercase letter | NEWLINE | any line-ending format: `"\n" \| "\r\n" \| "\r"` |
LOWERCASE_LETTER | lowercase letter | SPACE_SEPARATOR | space separator |
MATH_SYMBOL | math symbol | EMOJI | emoji |
nom#
Overview#
nom is a parser combinator library, as described above, written in Rust. Its characteristics:
- Build safe parsers without compromising speed or memory consumption.
- Use Rust's strong type system and memory safety to produce parsers that are both correct and efficient.
- Provide functions, macros, and traits that abstract away most of the error-prone plumbing, while making it easy to combine and reuse parsers to build complex ones.
nom supports a very wide range of use cases, including these common ones:
- Binary format parsers: nom is as fast as parsers hand-written in C, is not exposed to buffer overflow vulnerabilities, and has common handling patterns built in.
- Text format parsers: it handles formats such as CSV and more complex nested formats such as JSON. Besides managing the data, it comes with several useful tools built in.
- Programming language parsers: nom can serve as a prototyping parser for a language, with support for custom error types and reporting, automatic whitespace handling, and in-place AST construction.
- Beyond these: streaming formats (such as HTTP and network processing), and parser combinators that suit software engineering practice.
Usage example#
Here we walk through the hexadecimal color parser example from the README of the nom repository.
The hexadecimal color format is:
- It starts with "#", followed by six characters; each pair of characters is the value of one of the three color channels: red, green, and blue.
For example, in "#2F14DF", "2F" is the red channel, "14" the green channel, and "DF" the blue channel.
- Add the nom dependency to Cargo.toml
[dependencies]
nom = "7.1.3"
- Create src/nom/hex_color.rs, import nom, and build the hex_color parsing function
- tag matches a literal pattern at the start of the input. tag("#") returns a function whose return value is IResult<Input, Input, Error>.
- Here Input is the type of the function's input parameter. The first value is the input that remains after the matched pattern is removed, the second is the matched content, and the last is the error value.
- take_while_m_n takes the minimum and maximum number of matches as its first two parameters and the matching predicate as its last parameter, and returns a result of the same shape as above.
- map_res transforms the result of its first argument (a parser) with the function given as its second argument.
- tuple takes a group of combinators, applies them to the input in order, and returns their results in order as a tuple.
use nom::{AsChar, IResult};
use nom::bytes::complete::tag;
use nom::bytes::complete::take_while_m_n;
use nom::combinator::map_res;
use nom::sequence::tuple;
#[derive(Debug, PartialEq)]
pub struct Color {
pub red: u8,
pub green: u8,
pub blue: u8,
}
// whether the character is a hexadecimal digit
pub fn is_hex_digit(c: char) -> bool {
c.is_hex_digit()
}
// convert a hexadecimal string to its numeric (u8) value
pub fn to_num(input: &str) -> Result<u8, std::num::ParseIntError> {
u8::from_str_radix(input, 16)
}
// match exactly two characters that satisfy is_hex_digit, then convert the result to a number with to_num
pub fn hex_primary(input: &str) -> IResult<&str, u8> {
map_res(
take_while_m_n(2, 2, is_hex_digit),
to_num,
)(input)
}
// parser for a hexadecimal color
pub fn hex_color(input: &str) -> IResult<&str, Color> {
let (input, _) = tag("#")(input)?;
let (input, (red, green, blue)) = tuple((hex_primary, hex_primary, hex_primary))(input)?;
Ok((input, Color { red, green, blue }))
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_hex_color() {
assert_eq!(hex_color("#2F14DF"), Ok(("", Color {
red: 47,
green: 20,
blue: 223,
})))
}
}
Detailed usage#
Parser result#
IResult, the return type of the nom parsing functions in the example above, is one of nom's core structures and represents the result of a nom parse.
A parser built with nom defines its result as follows:
- Ok(...) means the parser found what it was looking for; Err(...) means it did not.
- On success, the value is a tuple: the first element is everything the parser did not consume, and the second is everything it matched.
- On failure, one of several kinds of error may be returned.
                                   ┌─► Ok(
                                   │      what the parser didn't touch,
                                   │      what matched the regex
                                   │   )
             ┌─────────┐           │
 my input───►│my parser├──►either──┤
             └─────────┘           └─► Err(...)
To express this model, nom defines the type IResult<Input, Output, Error>:
- Input and Output can be different types, and Error can be any type that implements the ParseError trait.
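The shape of this result model can be sketched without nom. The following is a simplified, std-only stand-in: `SimpleError`, `SimpleResult`, and `literal_tag` are hypothetical names, and unlike nom's real `IResult` the error here is not wrapped in `nom::Err`.

```rust
// Simplified stand-in for the error part of the model.
#[derive(Debug, PartialEq)]
struct SimpleError<'a> {
    input: &'a str,         // the input at the point where matching failed
    expected: &'static str, // what the parser was looking for
}

// Ok carries (remaining input, output); Err carries the error.
type SimpleResult<'a, O> = Result<(&'a str, O), SimpleError<'a>>;

// Matches a literal at the start of the input, similar in spirit to a tag parser.
fn literal_tag<'a>(expected: &'static str, input: &'a str) -> SimpleResult<'a, &'a str> {
    match input.strip_prefix(expected) {
        Some(rest) => Ok((rest, &input[..expected.len()])),
        None => Err(SimpleError { input, expected }),
    }
}
```

The important convention is the order inside `Ok`: the untouched remainder comes first and the matched output second, so the remainder can be fed straight into the next parser.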
Tags and character classes#
- Tags (byte sequences)
nom calls a simple sequence of bytes a tag. Because these are so common, there is a built-in tag() function that returns a parser for a given string.
For example, to parse the string "abc", use tag("abc").
Note that nom has several different definitions of tag. Unless stated otherwise, use the following one to avoid unexpected errors:
pub use nom::bytes::complete::tag;
The signature of tag is shown below. tag returns a function, and that function is a parser: it takes an input such as &str and returns an IResult:
pub fn tag<T, Input, Error: ParseError<Input>>(
tag: T
) -> impl Fn(Input) -> IResult<Input, Input, Error> where
Input: InputTake + Compare<T>,
T: InputLength + Clone,
Here is an example of a function that uses tag:
use nom::bytes::complete::tag;
use nom::IResult;
pub fn parse_input(input: &str) -> IResult<&str, &str> {
tag("abc")(input)
}
#[cfg(test)]
mod test {
use super::*;
#[test]
fn test_parse_input() {
let (leftover_input, output) = parse_input("abcWorld!").unwrap();
assert_eq!(leftover_input, "World!");
assert_eq!(output, "abc");
assert!(parse_input("defWorld").is_err());
}
}
- Character classes
A tag only matches one fixed sequence of characters at the start of the input, so nom also provides pre-written parsers, called character classes, that accept any one of a set of characters. Some commonly used built-in parsers:
Parser | Purpose | Parser | Purpose |
---|---|---|---|
alpha0/alpha1 | Recognizes zero or more lowercase and uppercase alphabetic characters; alpha1 requires at least one character | multispace0/multispace1 | Recognizes zero or more spaces, tabs, carriage returns, and line feeds; multispace1 requires at least one character |
alphanumeric0/alphanumeric1 | Recognizes zero or more digits or alphabetic characters; alphanumeric1 requires at least one character | space0/space1 | Recognizes zero or more spaces and tabs; space1 requires at least one character |
digit0/digit1 | Recognizes zero or more digits; digit1 requires at least one character | newline | Recognizes a newline character |
Here is a simple example of how they are used:
use nom::character::complete::alpha0;
use nom::IResult;
fn parse_alpha(input: &str) -> IResult<&str, &str> {
alpha0(input)
}
#[test]
fn test_parse_alpha() {
let (remaining, letters) = parse_alpha("abc123").unwrap();
assert_eq!(remaining, "123");
assert_eq!(letters, "abc");
}
Alternatives and composition#
- Alternatives
nom provides the alt() combinator for choosing between multiple parsers. It runs each parser in the tuple in turn until one succeeds.
It returns an error only if every parser in the tuple fails.
A simple example:
use nom::branch::alt;
use nom::bytes::complete::tag;
use nom::IResult;
fn parse_abc_or_def(input: &str) -> IResult<&str, &str> {
alt((
tag("abc"),
tag("def"),
))(input)
}
#[test]
fn test_parse_abc_or_def() {
let (leftover_input, output) = parse_abc_or_def("abcWorld").unwrap();
assert_eq!(leftover_input, "World");
assert_eq!(output, "abc");
let (_, output) = parse_abc_or_def("defWorld").unwrap();
assert_eq!(output, "def");
assert!(parse_abc_or_def("ghiWorld").is_err());
}
- Composition
Besides choosing between parsers, combining them in sequence is a very common need, so nom provides built-in combinators for it.
For example, tuple() takes a tuple of parsers. If all succeed, it returns Ok with a tuple of their results; otherwise it returns the Err of the first parser that failed.
use nom::branch::alt;
use nom::bytes::complete::tag_no_case;
use nom::IResult;
use nom::sequence::tuple;
fn parse_base(input: &str) -> IResult<&str, &str> {
alt((
        tag_no_case("a"), // tag_no_case matches case-insensitively
tag_no_case("t"),
tag_no_case("c"),
tag_no_case("g"),
))(input)
}
fn parse_pair(input: &str) -> IResult<&str, (&str, &str)> {
tuple((
parse_base, parse_base
))(input)
}
#[test]
fn test_parse_pair() {
let (remaining, parsed) = parse_pair("aTcG").unwrap();
assert_eq!(parsed, ("a", "T"));
assert_eq!(remaining, "cG");
assert!(parse_pair("Dct").is_err());
}
In addition to tuple, nom provides other combinators for similar sequencing operations:
Combinator | Usage | Input | Output |
---|---|---|---|
delimited | delimited(char('('), take(2), char(')')) | "(ab)cd" | Ok(("cd", "ab")) |
preceded | preceded(tag("ab"), tag("XY")) | "abXYZ" | Ok(("Z", "XY")) |
terminated | terminated(tag("ab"), tag("XY")) | "abXYZ" | Ok(("Z", "ab")) |
pair | pair(tag("ab"), tag("XY")) | "abXYZ" | Ok(("Z", ("ab", "XY"))) |
separated_pair | separated_pair(tag("hello"), char(','), tag("world")) | "hello,world!" | Ok(("!", ("hello", "world"))) |
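To show what two of these sequencing operations do to the input, here is a simplified std-only sketch. `delimited_by` and `preceded_by` are hypothetical helpers that work on literals rather than on arbitrary parsers, so they only approximate the combinators in the table.

```rust
// Keep the text between `open` and the first `close`, discarding both delimiters.
// Returns (rest, inner).
fn delimited_by<'a>(open: char, close: char, input: &'a str) -> Option<(&'a str, &'a str)> {
    let inner = input.strip_prefix(open)?;
    let end = inner.find(close)?;
    Some((&inner[end + close.len_utf8()..], &inner[..end]))
}

// Match `prefix` then `body`, discarding the prefix and keeping the body.
// Returns (rest, body).
fn preceded_by<'a>(prefix: &str, body: &str, input: &'a str) -> Option<(&'a str, &'a str)> {
    let after_prefix = input.strip_prefix(prefix)?;
    let rest = after_prefix.strip_prefix(body)?;
    Some((rest, &after_prefix[..body.len()]))
}
```

The point in both cases is that some parts of the input must be present for the parse to succeed but are dropped from the output.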
Parsers with custom return values#
Because the Input and Output of IResult can be different types, the result of a tag can be turned into a specific value. nom's value combinator converts a successful parse into a given value. Here is an example:
use nom::branch::alt;
use nom::bytes::complete::tag;
use nom::combinator::value;
use nom::IResult;
fn parse_bool(input: &str) -> IResult<&str, bool> {
alt((
        value(true, tag("true")), // convert to a bool
value(false, tag("false")),
))(input)
}
#[test]
fn test_parse_bool() {
let (remaining, parsed) = parse_bool("true|false").unwrap();
assert_eq!(parsed, true);
assert_eq!(remaining, "|false");
assert!(parse_bool(remaining).is_err());
}
Repetition with predicates and parsers#
- Repetition with predicates
A predicate-based parser keeps consuming input for as long as a condition holds. nom provides several families of these, mainly take_till, take_until, and take_while:
Combinator | Purpose | Usage | Input | Output |
---|---|---|---|---|
take_till | Consumes input until the predicate is satisfied | take_till(is_alphabetic) | "123abc" | Ok(("abc", "123")) |
take_while | Consumes input for as long as the predicate is satisfied | take_while(is_alphabetic) | "abc123" | Ok(("123", "abc")) |
take_until | Consumes input until the first occurrence of the given pattern | take_until("World") | "Hello World" | Ok(("World", "Hello ")) |
Each of these combinators also has a "twin" whose name ends in 1. The twin must match at least one character; otherwise it returns an error.
take_while_m_n, used earlier, is similar to take_while but guarantees that the amount of input consumed is in the range [m, n].
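The relationship between the plain, "1", and "m_n" variants can be sketched with std-only functions. These are hypothetical stand-ins that operate on `&str` and `char` predicates; they are not nom's implementations, and they count characters rather than bytes.

```rust
// Consume characters while `pred` holds. Returns (rest, matched); never fails.
fn take_while_chars(pred: impl Fn(char) -> bool, input: &str) -> (&str, &str) {
    let end = input.find(|c: char| !pred(c)).unwrap_or(input.len());
    (&input[end..], &input[..end])
}

// The "1" twin: fail unless at least one character matched.
fn take_while1_chars(pred: impl Fn(char) -> bool, input: &str) -> Option<(&str, &str)> {
    let (rest, matched) = take_while_chars(pred, input);
    if matched.is_empty() {
        None
    } else {
        Some((rest, matched))
    }
}

// Match between `m` and `n` characters (inclusive) that satisfy `pred`.
fn take_while_m_n_chars(
    m: usize,
    n: usize,
    pred: impl Fn(char) -> bool,
    input: &str,
) -> Option<(&str, &str)> {
    let mut end = 0; // byte offset just past the last accepted character
    let mut count = 0; // number of accepted characters
    for c in input.chars() {
        if count == n || !pred(c) {
            break;
        }
        end += c.len_utf8();
        count += 1;
    }
    if count < m {
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}
```

With `m = n = 2` and a hex-digit predicate, the last function behaves like the two-character channel parser used in the hex color example.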
- Repeating parsers
Besides repeating on a predicate, nom provides combinators that repeat a whole parser. For example, many0 applies a parser as many times as possible and returns a vector of the results. Here is an example:
use nom::bytes::complete::tag;
use nom::IResult;
use nom::multi::many0;
fn repeat_parser(s: &str) -> IResult<&str, Vec<&str>> {
many0(tag("abc"))(s)
}
#[test]
fn test_repeat_parser() {
assert_eq!(repeat_parser("abcabc"), Ok(("", vec!["abc", "abc"])));
assert_eq!(repeat_parser("abc123"), Ok(("123", vec!["abc"])));
assert_eq!(repeat_parser("123123"), Ok(("123123", vec![])));
assert_eq!(repeat_parser(""), Ok(("", vec![])));
}
Some commonly used combinators:
Combinator | Usage | Input | Output |
---|---|---|---|
count | count(take(2), 3) | "abcdefgh" | Ok(("gh", vec!["ab", "cd", "ef"])) |
many0 | many0(tag("ab")) | "abababc" | Ok(("c", vec!["ab", "ab", "ab"])) |
many_m_n | many_m_n(1, 3, tag("ab")) | "ababc" | Ok(("c", vec!["ab", "ab"])) |
many_till | many_till(tag( "ab" ), tag( "ef" )) | "ababefg" | Ok(("g", (vec!["ab", "ab"], "ef"))) |
separated_list0 | separated_list0(tag(","), tag("ab")) | "ab,ab,ab." | Ok((".", vec!["ab", "ab", "ab"])) |
fold_many0 | fold_many0(be_u8, || 0, |acc, item| acc + item) | [1, 2, 3] | Ok(([], 6)) |
fold_many_m_n | fold_many_m_n(1, 2, be_u8, || 0, |acc, item| acc + item) | [1, 2, 3] | Ok(([3], 3)) |
length_count | length_count(number, tag("ab")) | "2ababab" | Ok(("ab", vec!["ab", "ab"])) |
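As one worked illustration of the repetition combinators listed above, here is a std-only sketch of the "separated list" idea. `separated_literals` is a hypothetical helper restricted to literal items and separators; it is meant to show the loop structure, not to reproduce nom's implementation.

```rust
// Collect zero or more `item` literals separated by `sep`.
// Returns (rest, items). A trailing separator that is not followed by an
// item is left unconsumed.
fn separated_literals<'a>(sep: &str, item: &str, input: &'a str) -> (&'a str, Vec<&'a str>) {
    let mut items = Vec::new();
    let mut rest = input;
    // Optional first item.
    match rest.strip_prefix(item) {
        Some(after) => {
            items.push(&rest[..item.len()]);
            rest = after;
        }
        None => return (rest, items),
    }
    // Then zero or more "separator + item" groups.
    while let Some(after) = rest.strip_prefix(sep).and_then(|r| r.strip_prefix(item)) {
        items.push(&rest[sep.len()..sep.len() + item.len()]);
        rest = after;
    }
    (rest, items)
}
```

Because zero items are allowed, the function always succeeds; an input that does not start with the item simply yields an empty vector and the untouched input.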
Error management#
nom's errors are designed with several needs in mind:
- Indicate which parser failed and where in the input data
- Accumulate more context as the error goes up the parser chain
- Have very low overhead, since errors are often discarded by the calling parser
- Be adaptable to the user's needs, since some languages require more information
To meet these needs, the result type of a nom parser is designed as follows:
pub type IResult<I, O, E=nom::error::Error<I>> = Result<(I, O), nom::Err<E>>;
pub enum Err<E> {
    Incomplete(Needed), // the parser did not have enough data to decide; usually seen in streaming scenarios
    Error(E), // a normal parser error; for example, when a child parser of alt returns Error, alt tries the next child parser
    Failure(E), // an unrecoverable error; for example, when a child parser returns Failure, alt does not try the other branches
}
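The difference between a recoverable Error and an unrecoverable Failure can be sketched with a small std-only model. Everything here (`ParseErr`, `Res`, `Parser`, `first_of`, `abc`, `no_digit`) is hypothetical and only imitates the rule described in the comments above; it is not nom's code.

```rust
#[derive(Debug, PartialEq)]
enum ParseErr {
    Error(String),   // recoverable: a choice may try the next branch
    Failure(String), // unrecoverable: a choice must stop immediately
}

type Res<'a> = Result<(&'a str, &'a str), ParseErr>;
type Parser = for<'r> fn(&'r str) -> Res<'r>;

// Choice combinator that follows the Error/Failure rule.
fn first_of<'a>(parsers: &[Parser], input: &'a str) -> Res<'a> {
    let mut last = ParseErr::Error("no alternatives given".to_string());
    for parser in parsers {
        match parser(input) {
            Ok(found) => return Ok(found),
            // A Failure short-circuits: later branches are never tried.
            Err(ParseErr::Failure(msg)) => return Err(ParseErr::Failure(msg)),
            // An Error is remembered and the next branch is tried.
            Err(recoverable) => last = recoverable,
        }
    }
    Err(last)
}

// Matches the literal "abc"; a mismatch is recoverable.
fn abc(input: &str) -> Res<'_> {
    match input.strip_prefix("abc") {
        Some(rest) => Ok((rest, &input[..3])),
        None => Err(ParseErr::Error("expected abc".to_string())),
    }
}

// Treats a leading digit as a hard failure, anything else as a recoverable error.
fn no_digit(input: &str) -> Res<'_> {
    if input.starts_with(|c: char| c.is_ascii_digit()) {
        Err(ParseErr::Failure("digit not allowed".to_string()))
    } else {
        Err(ParseErr::Error("not a digit".to_string()))
    }
}
```

Placing `no_digit` before `abc` in the list shows the effect: on "abcd" the first branch fails recoverably and `abc` still gets its turn, while on "1abc" the Failure stops the choice at once.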
Common error types wrapped by nom::Err<E>
- The default error type nom::error::Error returns which parser the error came from and the input position of the error:
#[derive(Debug, PartialEq)]
pub struct Error<I> {
    /// position of the error in the input data
    pub input: I,
    /// nom error code
    pub code: ErrorKind,
}
- This error type is fast and has very low overhead, so it suits parsers that are called repeatedly, but it is limited: for example, it does not return information about the call chain.
- To get more information, use nom::error::VerboseError, which returns more information about the chain of parsers in which the error occurred (such as the kind of parser):
#[derive(Clone, Debug, PartialEq)]
pub struct VerboseError<I> {
    /// list of errors accumulated by `VerboseError`, each containing the affected part of the input data and some context
    pub errors: crate::lib::std::vec::Vec<(I, VerboseErrorKind)>,
}
#[derive(Clone, Debug, PartialEq)]
/// error context for `VerboseError`
pub enum VerboseErrorKind {
    /// static string added by the `context` function
    Context(&'static str),
    /// indicates which character was expected by the `char` function
    Char(char),
    /// error kind given by various nom parsers
    Nom(ErrorKind),
}
- By examining the original input together with the chain of errors, you can build more user-friendly error messages. The nom::error::convert_error function builds such a message.
- Custom error types with the ParseError trait
You can define your own error type by implementing the ParseError<I> trait.
All nom combinators are generic over the error type, so you only need to name your type in the parser's result type and it will be used everywhere.
pub trait ParseError<I>: Sized {
    // Creates an error from the input position and an ErrorKind, indicating which parser failed.
    fn from_error_kind(input: I, kind: ErrorKind) -> Self;
    // Combines an existing error with a new one while backtracking through the parser tree (various combinators add more context this way).
    fn append(input: I, kind: ErrorKind, other: Self) -> Self;
    // Creates an error that indicates which character was expected.
    fn from_char(input: I, _: char) -> Self {
        Self::from_error_kind(input, ErrorKind::Char)
    }
    // In combinators like alt, allows choosing between (or accumulating) errors from different branches.
    fn or(self, other: Self) -> Self {
        other
    }
}
Implementing the ContextError trait as well adds support for the context() combinator, which VerboseError<I> uses.
The following simple example shows how this works. It defines a debugging error type that prints extra information every time an error is created:
use nom::error::{ContextError, ErrorKind, ParseError};
#[derive(Debug)]
struct DebugError {
message: String,
}
impl ParseError<&str> for DebugError {
    // print the kind of parser that failed and the input that caused it
    fn from_error_kind(input: &str, kind: ErrorKind) -> Self {
        let message = format!("[{:?}]:\t{:?}\n", kind, input);
println!("{}", message);
DebugError { message }
}
    // print the errors accumulated so far together with the new context
    fn append(input: &str, kind: ErrorKind, other: Self) -> Self {
        let message = format!("[{}{:?}]:\t{:?}\n", other.message, kind, input);
println!("{}", message);
DebugError { message }
}
    // print the specific character that was expected
    fn from_char(input: &str, c: char) -> Self {
        let message = format!("[{}]:\t{:?}\n", c, input);
print!("{}", message);
DebugError { message }
}
fn or(self, other: Self) -> Self {
let message = format!("{}\tOR\n{}\n", self.message, other.message);
println!("{}", message);
DebugError { message }
}
}
impl ContextError<&str> for DebugError {
fn add_context(_input: &str, _ctx: &'static str, other: Self) -> Self {
        let message = format!("[{}\"{}\"]:\t{:?}\n", other.message, _ctx, _input);
print!("{}", message);
DebugError { message }
}
}
- Debugging parsers
While writing a parser, if you need to trace what a parser is doing, the dbg_dmp function prints the parser's input and output:
fn f(i: &[u8]) -> IResult<&[u8], &[u8]> {
dbg_dmp(tag("abcd"), "tag")(i)
}
let a = &b"efghijkl"[..];
// this prints the following message:
// tag: Error(Error(Error { input: [101, 102, 103, 104, 105, 106, 107, 108], code: Tag })) at:
// 00000000 65 66 67 68 69 6a 6b 6c efghijkl
f(a);
Summary#
In this article we covered the background knowledge on parsers (parsers, PEG, and parser combinators) and how to use the third-party libraries needed to implement a parser in Rust (pest and nom). Both pest, based on PEG, and nom, based on parser combinators, cover the common scenarios for implementing a custom parser, so there is no need to hand-write a parser from scratch, which greatly reduces the cost of implementation. When more factors are involved (performance, implementation cost, available documentation, and so on), choose the library that suits your specific scenario.
In the next article, we will implement a few common parsers with pest and nom to better understand the key points of parsers.
References#
https://zhuanlan.zhihu.com/p/427767002
https://zh.wikipedia.org/wiki/%E8%A7%A3%E6%9E%90%E8%A1%A8%E8%BE%BE%E6%96%87%E6%B3%95
https://zhuanlan.zhihu.com/p/355364928
https://ohmyweekly.github.io/notes/2021-01-20-pest-grammars/#
https://pest.rs/book/parser_api.html
https://rustmagazine.github.io/rust_magazine_2021/chapter_4/nom_url.html