Regex in Java: match groups until first symbol occurrence


My string looks like this:

"Chitkara DK, Rawat DJY, Talley N. The epidemiology of childhood recurrent abdominal pain in Western countries: a systematic review. Am J Gastroenterol. 2005;100(8):1868-75. DOI."

What I want is to get the uppercase letters (as separate words only) up to the first dot, which gives: DK DJY N. I do not want the matches that come after the first dot, such as J and DOI.

Here is the pattern I am using with the Java Pattern class:

\\b[A-Z]{1,3}\\b

Is there a general option in regex to stop matching after a certain character?
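
To make the problem concrete, here is a minimal sketch of what the pattern above returns when it is run over the whole string. The class name OverMatchDemo and the helper findAll are hypothetical names chosen for illustration; only the pattern and the input come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverMatchDemo {
    // The pattern from the question: 1 to 3 uppercase letters standing alone as a word
    static final Pattern ABBREVIATION = Pattern.compile("\\b[A-Z]{1,3}\\b");

    // Collects every match in the input, with no notion of "stop at the first dot"
    static List<String> findAll(String input) {
        List<String> results = new ArrayList<>();
        Matcher m = ABBREVIATION.matcher(input);
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        String citation = "Chitkara DK, Rawat DJY, Talley N. The epidemiology of childhood "
                + "recurrent abdominal pain in Western countries: a systematic review. "
                + "Am J Gastroenterol. 2005;100(8):1868-75. DOI.";
        // The unwanted J and DOI are included
        System.out.println(findAll(citation)); // prints [DK, DJY, N, J, DOI]
    }
}
```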

2 Answers

#1


5  

You can make use of continuous matching with \G and extract your desired matches from the first capturing group:

(?:\\G|^)[^.]+?\\b([A-Z]{1,3})\\b

You need the MULTILINE flag to use this in a multiline context, so that ^ can restart the matching at the beginning of each line. If your content is always a single line, you may drop the |^ from your pattern.

See https://regex101.com/r/JXIu21/3

Note that regex101 uses a PCRE pattern, but all features used are also available in Java regex.
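
A minimal Java sketch of how this pattern might be applied, first to a single-line input and then with Pattern.MULTILINE. The class name ContinuousMatchDemo, the helper extract, and the two-line sample input are assumptions made for illustration; only the regex itself comes from this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContinuousMatchDemo {
    // \G ties each match to the end of the previous one, and [^.]+? can never
    // cross a dot, so the chain of matches breaks at the first dot.
    static final String REGEX = "(?:\\G|^)[^.]+?\\b([A-Z]{1,3})\\b";

    // flags is 0 for single-line input, Pattern.MULTILINE to restart on every line
    static List<String> extract(String input, int flags) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(REGEX, flags).matcher(input);
        while (m.find()) {
            results.add(m.group(1)); // the abbreviation is in capturing group 1
        }
        return results;
    }

    public static void main(String[] args) {
        String citation = "Chitkara DK, Rawat DJY, Talley N. The epidemiology of childhood "
                + "recurrent abdominal pain in Western countries: a systematic review. "
                + "Am J Gastroenterol. 2005;100(8):1868-75. DOI.";
        System.out.println(extract(citation, 0)); // prints [DK, DJY, N]

        // Hypothetical two-line input: each line contributes only its first sentence
        String twoLines = "Chitkara DK, Rawat DJY. Am J Foo.\nTalley N, Smith AB. DOI.";
        System.out.println(extract(twoLines, Pattern.MULTILINE)); // prints [DK, DJY, N, AB]
    }
}
```

One caveat of the pattern as written: because [^.]+? must consume at least one character, an abbreviation at the very start of the input (or of a line) is not captured.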

#2


2  

Sebastian Proske's answer is great, but it's often easier (and more readable) to split complex parsing tasks into separate steps. We can split your goal into two separate steps and thereby create a much simpler and more clearly-correct solution, using your original pattern.

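import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static final Pattern UPPER_CASE_ABBV_PATTERN = Pattern.compile("\\b[A-Z]{1,3}\\b");

public static List<String> getAbbreviationsInFirstSentence(String input) {
  // isolate the first sentence, since that's all we care about;
  // the limit of 2 stops splitting after the first dot and always yields at least one element
  String firstSentence = input.split("\\.", 2)[0];
  // then look for matches in the first sentence
  Matcher m = UPPER_CASE_ABBV_PATTERN.matcher(firstSentence);
  List<String> results = new ArrayList<>();
  while (m.find()) {
    // each match is one standalone uppercase abbreviation
    results.add(m.group());
  }
  return results;
}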
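
As a variant of the same two-step idea, the matcher can be restricted to the part of the input before the first dot with Matcher.region, which avoids creating a substring. This is a sketch and not part of the original answer; the class name RegionDemo and the method abbreviationsBefore are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionDemo {
    // The original pattern from the question, unchanged
    static final Pattern ABBREVIATION = Pattern.compile("\\b[A-Z]{1,3}\\b");

    // Returns the abbreviations found before the first occurrence of stop;
    // if stop does not occur, the whole input is searched.
    static List<String> abbreviationsBefore(String input, char stop) {
        int end = input.indexOf(stop);
        if (end < 0) {
            end = input.length();
        }
        // region(start, end) limits find() to the range [start, end)
        Matcher m = ABBREVIATION.matcher(input).region(0, end);
        List<String> results = new ArrayList<>();
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        String citation = "Chitkara DK, Rawat DJY, Talley N. The epidemiology of childhood "
                + "recurrent abdominal pain in Western countries: a systematic review. "
                + "Am J Gastroenterol. 2005;100(8):1868-75. DOI.";
        System.out.println(abbreviationsBefore(citation, '.')); // prints [DK, DJY, N]
    }
}
```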
