A bug in the ClojureScript implementation of re-seq
(defn re-seq
  "Returns a lazy sequence of successive matches of re in s."
  [re s]
  (let [match-data (re-find re s)
        match-idx (.search s re)
        match-str (if (coll? match-data) (first match-data) match-data)
        post-idx (+ match-idx (max 1 (count match-str)))
        post-match (subs s post-idx)]
    (when match-data
      (lazy-seq
        (cons match-data
              (when (<= post-idx (count s))
                (re-seq re post-match)))))))
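The problem is that it recurses by calling re-seq on the remainder of the string. Each recursive call sees a new, shorter string, so an anchored pattern such as ^[a-f] matches again at the start of that remainder.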
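The effect can be reproduced by hand, without going through re-seq at all. The following is a minimal sketch; anchored-re and input are hypothetical names chosen for illustration, and the calls simply mimic what the recursion does with each remainder.

```clojure
;; Illustration only: anchored-re and input are hypothetical names.
(def anchored-re #"^[a-f]")
(def input "abcxyz")

;; On the full string, the anchor allows a match only at the first character.
(re-find anchored-re input)          ;; => "a"

;; On the remainder, "b" now sits at index 0, so the anchor matches again.
(re-find anchored-re (subs input 1)) ;; => "b"
(re-find anchored-re (subs input 2)) ;; => "c"

;; "x" is outside [a-f], so the chain of matches stops here.
(re-find anchored-re (subs input 3)) ;; => nil
```

With the implementation as quoted above, (re-seq anchored-re input) would therefore be expected to yield ("a" "b" "c") rather than just ("a").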
One solution is to make your regular expression sticky:
(js/RegExp. #"^." "y")
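This makes later uses of the regular expression aware of the previous match, because a sticky regex remembers where that match ended. Be careful about where you put this code: the regex has to be created in the right place, and it must not be a global (shared) object! If it is shared, you will run into strange state problems like this one: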
(let [re (js/RegExp. #"^." "y")]
  [(re-seq re "cccc")
   (re-seq re "abbb")])
;; => [("c" "c") nil]
(I can't explain this at all!)
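One possible explanation, offered as a sketch rather than a confirmed diagnosis: a sticky regex stores the position of its next match attempt in its lastIndex property, and that state survives between calls. The snippet below assumes that re-find delegates to RegExp.prototype.exec; sticky-re is a hypothetical name.

```clojure
;; Assumption: re-find calls RegExp.prototype.exec, which for a sticky regex
;; starts matching at lastIndex and updates lastIndex afterwards.
(def sticky-re (js/RegExp. #"^." "y"))

(.-lastIndex sticky-re)     ;; => 0
(re-find sticky-re "cccc")  ;; => "c"
(.-lastIndex sticky-re)     ;; => 1

;; The next attempt starts at index 1, where ^ cannot match.
(re-find sticky-re "abbb")  ;; => nil

;; A failed match resets the position.
(.-lastIndex sticky-re)     ;; => 0
```

Under that assumption, and because re-seq is lazy, the first re-seq call performs only its first match before the second call runs, leaving lastIndex at 1. The second call then fails immediately and resets lastIndex to 0. When the first sequence is finally realized, it matches one more "c" from position 0 before failing in turn, which would account for ("c" "c") and nil.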
A possible alternative implementation of re-seq, which creates this initial sticky clone for you:
(defn re-seq2
  "Returns a lazy sequence of successive matches of re in s."
  [re s]
  (let [re-seq* (fn re-seq* [re s]
                  (let [match-data (re-find re s)
                        match-idx (.search s re)
                        match-str (if (coll? match-data) (first match-data) match-data)
                        post-idx (+ match-idx (max 1 (count match-str)))
                        post-match (subs s post-idx)]
                    (when match-data
                      (lazy-seq
                        (cons match-data
                              (when (<= post-idx (count s))
                                (re-seq* re post-match)))))))]
    ;; Build a fresh sticky copy of re on every call, so its match state
    ;; is never shared between separate calls to re-seq2.
    (re-seq* (js/RegExp. re "y") s)))
(let [re #"^."]
  [(re-seq2 re "cccc")
   (re-seq2 re "abbb")])
;; => [("c") ("a")]