This is a bug in ClojureScript's implementation of re-seq.
(defn re-seq
  "Returns a lazy sequence of successive matches of re in s."
  [re s]
  (let [match-data (re-find re s)
        match-idx (.search s re)
        match-str (if (coll? match-data) (first match-data) match-data)
        post-idx (+ match-idx (max 1 (count match-str)))
        post-match (subs s post-idx)]
    (when match-data
      (lazy-seq
        (cons match-data
              (when (<= post-idx (count s))
                (re-seq re post-match)))))))
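The problem is that it repeatedly recurs into re-seq with the remainder of the string. Each recursive call sees a new, shorter string, so an anchored pattern such as ^[a-f] matches again at the start of that substring.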
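To make the effect concrete, here is a minimal sketch in TypeScript that runs against the same JavaScript RegExp engine ClojureScript compiles down to. The function `reSeqBySlicing` is a hypothetical stand-in that mimics the slice-and-retry loop of the implementation above. It is not the actual ClojureScript source, and the input "abcxyz" is just an illustrative string.

```typescript
// Hypothetical stand-in for the substring-based loop; not the real ClojureScript code.
function reSeqBySlicing(re: RegExp, s: string): string[] {
  const out: string[] = [];
  let rest = s;
  while (true) {
    const m = re.exec(rest);
    if (m === null) break;
    out.push(m[0]);
    // Same arithmetic as post-idx: skip past the match, or one char for empty matches.
    const postIdx = m.index + Math.max(1, m[0].length);
    if (postIdx > rest.length) break;
    // Slicing makes the old position 1, 2, ... the new position 0, so ^ can match again.
    rest = rest.substring(postIdx);
  }
  return out;
}

const sliced = reSeqBySlicing(new RegExp("^[a-f]"), "abcxyz");
console.log(sliced); // [ 'a', 'b', 'c' ]

// A single scan over the original string only matches at the real start.
const oneScan = "abcxyz".match(new RegExp("^[a-f]", "g"));
console.log(oneScan); // [ 'a' ]
```

The slicing version reports three matches because every substring has its own "start of input", whereas a single scan of the unsliced string finds only the first character.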
One solution is to make your regex sticky.
(js/RegExp. #"^." "y")
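This makes subsequent uses of the regex aware of the previous matches. Note that you need to be careful where you place this code: the regex has to be created in the right place, and it can't be a global value shared between calls! If it is global, you run into odd state problems like this: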
(let [re (js/RegExp. #"^." "y")]
  [(re-seq re "cccc")
   (re-seq re "abbb")])
;; => [("c" "c") nil]
(Which I can't explain at all!)
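One plausible reading of that result: a sticky RegExp carries a mutable `lastIndex`, and the re-seq shown earlier runs its first re-find eagerly but defers the recursive call inside lazy-seq. Under that assumption, the calls on the shared regex interleave roughly as in the TypeScript sketch below, which calls `exec` directly in the assumed order. It is an illustration of the host RegExp behavior, not a trace of the actual ClojureScript runtime.

```typescript
const re = new RegExp("^.", "y"); // one sticky regex shared by every call

// 1. (re-seq re "cccc") computes its first match right away.
const head = re.exec("cccc");
console.log(head && head[0], re.lastIndex); // c 1

// 2. (re-seq re "abbb") runs next, but the attempt starts at lastIndex 1,
//    where ^ cannot match, so it fails and lastIndex resets to 0.
const second = re.exec("abbb");
console.log(second, re.lastIndex); // null 0

// 3. Only now is the lazy tail of the first seq realized, on the remainder "ccc".
//    lastIndex is back at 0, so the anchor matches once more.
const tail = re.exec("ccc");
console.log(tail && tail[0], re.lastIndex); // c 1

// 4. The next remainder "cc" is tried at lastIndex 1 and fails, ending the seq.
const end = re.exec("cc");
console.log(end, re.lastIndex); // null 0
```

Read this way, steps 1, 3 and 4 give ("c" "c") for the first call and step 2 gives nil for the second, which lines up with the output shown.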
An alternative implementation of re-seq could create this initial copy for you:
(defn re-seq2
  "Returns a lazy sequence of successive matches of re in s."
  [re s]
  (let [re-seq* (fn re-seq* [re s]
                  (let [match-data (re-find re s)
                        match-idx (.search s re)
                        match-str (if (coll? match-data) (first match-data) match-data)
                        post-idx (+ match-idx (max 1 (count match-str)))
                        post-match (subs s post-idx)]
                    (when match-data
                      (lazy-seq
                        (cons match-data
                              (when (<= post-idx (count s))
                                (re-seq* re post-match)))))))]
    (re-seq* (js/RegExp. re "y") s)))
(let [re #"^."]
  [(re-seq2 re "cccc")
   (re-seq2 re "abbb")])
;; => [("c") ("a")]
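The reason a per-call copy helps can be sketched at the RegExp level: each call gets its own `lastIndex`, so nothing leaks from one input to the next. The helper `firstStickyMatch` below is a hypothetical name used only for illustration, written in TypeScript against the host RegExp. Note that it builds the copy with only the "y" flag, so any flags on the original pattern (such as "i") would be dropped in this sketch.

```typescript
// Hypothetical helper: match once using a private sticky copy of the pattern.
function firstStickyMatch(re: RegExp, s: string): string | null {
  // A fresh object per call means lastIndex always starts at 0 here.
  // Assumption: passing only "y" discards whatever flags `re` carried.
  const sticky = new RegExp(re.source, "y");
  const m = sticky.exec(s);
  return m === null ? null : m[0];
}

const shared = new RegExp("^.");
const results = [
  firstStickyMatch(shared, "cccc"),
  firstStickyMatch(shared, "abbb"),
];
console.log(results); // [ 'c', 'a' ]
```

Both inputs match at their own start, since the state mutated by `exec` lives on the throwaway copy rather than on the caller's regex.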