To check whether a string a starts with another string b, we use a.startsWith(b).
To take the part of a that follows the last occurrence of string b, we use a.substring(a.lastIndexOf(b)+b.size).
...
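As a rough sketch of the two plain-String operations above: the object and method names here (PlainStringOps, beginsWith, afterLast) are made up for illustration, and only startsWith, lastIndexOf and substring come from the text. The Option return type is my own addition, because lastIndexOf returns -1 when b is absent and the one-liner would then quietly return the wrong substring.

```scala
// Hypothetical helper names; only startsWith, lastIndexOf and substring
// are taken from the text above.
object PlainStringOps {
  // True when `a` begins with `b`.
  def beginsWith(a: String, b: String): Boolean = a.startsWith(b)

  // Everything in `a` after the last occurrence of `b`,
  // or None when `b` does not occur in `a` at all.
  def afterLast(a: String, b: String): Option[String] = {
    val i = a.lastIndexOf(b)
    if (i < 0) None else Some(a.substring(i + b.length))
  }
}
```

For example, PlainStringOps.afterLast("scala.util.matching.Regex", ".") gives Some("Regex").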
All of the string comparisons above can also be done with regular expressions (supported since JDK 1.4).
The Regex class is there to let you use regular expressions with ease (!?).
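To make the claim concrete, here is one possible way to redo the two String operations with Regex. This is only a sketch: RegexEquivalents, beginsWith and afterLast are hypothetical names, and java.util.regex.Pattern.quote is used on the assumption that b should be treated as literal text rather than as a pattern.

```scala
import scala.util.matching.Regex
import java.util.regex.Pattern

// Hypothetical helpers showing regex counterparts of the String methods.
object RegexEquivalents {
  // Counterpart of a.startsWith(b): anchor the quoted literal at the start.
  def beginsWith(a: String, b: String): Boolean =
    new Regex("^" + Pattern.quote(b)).findFirstIn(a).isDefined

  // Counterpart of a.substring(a.lastIndexOf(b) + b.length):
  // the greedy ".*" runs up to the last occurrence of b, and the
  // capture group holds whatever follows it. (?s) lets "." match newlines.
  def afterLast(a: String, b: String): Option[String] = {
    val p = new Regex("(?s).*" + Pattern.quote(b) + "(.*)")
    a match {
      case p(rest) => Some(rest)
      case _       => None // b does not occur in a
    }
  }
}
```

For anything this simple the plain String methods are shorter; the regex form starts to pay off once the thing being matched is a pattern rather than a fixed string.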
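There are several ways to create a Regex pattern:
The first, as in the example below, is to call the r method on a string to get a Regex object. (String does not actually have an r method; it is found through an implicit conversion, which in Scala 2.8 and later is the conversion to StringOps. If you don't know what implicits are, see this post.)
Triple quotes are used to avoid the hassle of escaping the backslashes inside the pattern.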
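A small side-by-side sketch of what the triple quotes buy you. QuotingDemo and its member names are made up for illustration; the date pattern is the one used in this post.

```scala
// Hypothetical demo object comparing the two literal styles.
object QuotingDemo {
  // Ordinary string literal: every regex backslash has to be doubled.
  val escaped = "(\\d\\d\\d\\d)-(\\d\\d)-(\\d\\d)".r

  // Triple-quoted literal: backslashes are taken as-is.
  val raw = """(\d\d\d\d)-(\d\d)-(\d\d)""".r

  // Both literals end up as the same pattern text.
  val samePattern: Boolean =
    escaped.pattern.pattern() == raw.pattern.pattern()
}
```

QuotingDemo.samePattern is true; the only difference is how readable the source is.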
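Next, when matching a string, you can directly pull out each group captured by the pattern,
just as the example below splits a date into three groups: year, month and day.
It is easy to use, but the downside is that a failed match immediately throws an error (a scala.MatchError).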
scala> val dateP1 = """(\d\d\d\d)-(\d\d)-(\d\d)""".r
dateP1: scala.util.matching.Regex = (\d\d\d\d)-(\d\d)-(\d\d)

scala> val dateP1(year, month, day) = "2011-07-15"
year: String = 2011
month: String = 07
day: String = 15

scala> val dateP1(year, month, day) = "2011-7-15"
scala.MatchError: 2011-7-15 (of class java.lang.String)
        ...
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:704)
        at scala.tools.nsc.interpreter.IMain$Request$$anonfun$14.apply(IMain.scala:920)
        at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
        at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
        at java.lang.Thread.run(Thread.java:680)

Alternatively, you can match with the functions that Regex provides, such as findFirstIn and findFirstMatchIn.
Even when the match fails you simply get None, which makes the follow-up handling relatively painless.
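findFirstMatchIn is mentioned here without an example, so here is a minimal sketch of it. It returns an Option[Regex.Match], from which groups can be read by index. FindFirstMatchInDemo and yearOf are hypothetical names; dateP1 is the pattern from this post.

```scala
import scala.util.matching.Regex

// Hypothetical demo object; dateP1 is the pattern used in this post.
object FindFirstMatchInDemo {
  val dateP1: Regex = """(\d\d\d\d)-(\d\d)-(\d\d)""".r

  // findFirstMatchIn gives Option[Regex.Match]. group(1) is the first
  // capture group (the year); group(0) would be the whole match.
  def yearOf(text: String): Option[String] =
    dateP1.findFirstMatchIn(text).map(m => m.group(1))
}
```

With this, yearOf("Date of this document: 2011-07-15") is Some("2011"), and a text with no well-formed date yields None instead of an exception.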
scala> val copyright: String = dateP1 findFirstIn "Date of this document: 2011-07-15" match {
     |   case Some(dateP1(year, month, day)) => "Copyright "+year
     |   case None => "No copyright"
     | }
copyright: String = Copyright 2011

scala> val copyright: String = dateP1 findFirstIn "Date of this document: 2011-7-15" match {
     |   case Some(dateP1(year, month, day)) => "Copyright "+year
     |   case None => "No copyright"
     | }
copyright: String = No copyright

The other way to create a Regex is to do it the honest way and use the constructor!
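A minimal sketch of the constructor route, assuming the same date pattern as before. ConstructorDemo, dateP2 and split are names invented for this example. The match with a fallback case is one way to avoid the MatchError seen earlier.

```scala
import scala.util.matching.Regex

// Hypothetical demo object; dateP2 is an illustrative name.
object ConstructorDemo {
  // Same pattern as dateP1, built with the constructor instead of .r
  val dateP2: Regex = new Regex("""(\d\d\d\d)-(\d\d)-(\d\d)""")

  // Pattern matching with a fallback case returns None on failure
  // rather than throwing scala.MatchError.
  def split(date: String): Option[(String, String, String)] = date match {
    case dateP2(year, month, day) => Some((year, month, day))
    case _                        => None
  }
}
```

Here split("2011-07-15") is Some(("2011", "07", "15")) and split("2011-7-15") is None.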
ref:
- scala api document: Regex
- regular expression metacharacter syntax
- Java Regular Expression notes