How would I do this in Java Regex?


I'm trying to write a regex that matches all words (say, chicken) that are not in square brackets. For example,

chicken

Would be selected but

[chicken]

Would not. Does anyone know how to do this?

4 solutions

#1


7  

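import java.util.regex.Matcher;
import java.util.regex.Pattern;

String template = "[chicken]";
// Word characters not preceded by '[' and not followed by ']'
String pattern = "\\G(?<!\\[)(\\w+)(?!\\])";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(template);

// Prints nothing for this template: its only word is bracketed
while (m.find())
{
    System.out.println(m.group());
}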

It combines a negative look-behind, a negative look-ahead, and a boundary matcher.

(?<!\\[) //negative look behind
(?!\\])  //negative look ahead
(\\w+)   //capture group for the word
\\G      //is a boundary matcher for marking the end of the previous match 

(please read the following edits for clarification)

EDIT 1:
If one needs to account for situations like:

"chicken [chicken] chicken [chicken]"

We can replace the regex with:

String regex = "(?<!\\[)\\b(\\w+)\\b(?!\\])";
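As a rough illustration of how the EDIT 1 regex behaves on the sample string, here is a minimal sketch. The class name `Edit1Demo` and the helper `matchesWithOffsets` are hypothetical names introduced only for this example; the regex itself is the one given above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the answer.
class Edit1Demo {
    // The EDIT 1 regex: a whole word not preceded by '[' and not followed by ']'.
    static final Pattern WORD = Pattern.compile("(?<!\\[)\\b(\\w+)\\b(?!\\])");

    // Collects each matched word together with its start offset, e.g. "chicken@0".
    static List<String> matchesWithOffsets(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = WORD.matcher(input);
        while (m.find()) {
            found.add(m.group(1) + "@" + m.start());
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints [chicken@0, chicken@18]: only the two unbracketed words match.
        System.out.println(matchesWithOffsets("chicken [chicken] chicken [chicken]"));
    }
}
```

The `\b` anchors are what stop the engine from settling for a partial word inside the brackets.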

EDIT 2:
If one also needs to account for situations like:

"[chicken"
"chicken]"

That is, if you still want "chicken" to match when a bracket appears on only one side, you could use:

String pattern = "(?<!\\[)?\\b(\\w+)\\b(?!\\])|(?<!\\[)\\b(\\w+)\\b(?!\\])?";

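This accounts for the two cases where a bracket appears on only one side. The | acts as an OR between the two alternatives, and the ? after a look-behind or look-ahead means 0 or 1 of the previous expression, which makes that check optional. In effect, the first alternative only enforces the look-ahead and the second only enforces the look-behind, so a word is rejected only when it has a bracket on both sides.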
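To make the one-sided cases concrete, here is a minimal sketch. It uses a simplified pattern that drops the optional look-arounds, on the assumption that an optional zero-width check can always be skipped and so does not change which words match. The class name `OneSidedBracketDemo` and the helper `findWords` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class OneSidedBracketDemo {
    // Assumed simplification of the EDIT 2 pattern:
    //   alternative 1: a whole word not followed by ']'
    //   alternative 2: a whole word not preceded by '['
    static final Pattern WORD = Pattern.compile("\\b\\w+\\b(?!\\])|(?<!\\[)\\b\\w+\\b");

    // Returns every word accepted by either alternative.
    static List<String> findWords(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = WORD.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findWords("[chicken"));   // prints [chicken]
        System.out.println(findWords("chicken]"));   // prints [chicken]
        System.out.println(findWords("[chicken]"));  // prints []
    }
}
```

Using `m.group()` rather than a numbered group avoids having to know which alternative matched.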

#2


2  

I guess you want something like:

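import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// A whole word that is not preceded by '[' and not followed by ']'
private static final Pattern UNBRACKETED_WORD_PAT = Pattern.compile("(?<!\\[)\\b\\w+\\b(?!])");

private List<String> findAllUnbracketedWords(final String s) {
    final List<String> ret = new ArrayList<String>();
    final Matcher m = UNBRACKETED_WORD_PAT.matcher(s);
    while (m.find()) {
        ret.add(m.group());
    }
    return Collections.unmodifiableList(ret);
}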

#3


0  

Use this:

/(?<![\[\w])\w+(?![\w\]])/

i.e., consecutive word characters with no square bracket or word character before or after.

The look-arounds must reject both a square bracket and a word character on each side; otherwise, for your input of [chicken], the engine would shrink the match until the bracket checks pass and simply return

hicke
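For Java, the slash-delimited regex above would need to be written as a string literal with doubled backslashes. The sketch below compares it with a bracket-only variant to show where "hicke" comes from. The class name `LookaroundDemo`, the constants `NAIVE` and `STRICT`, and the helper `findAll` are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class LookaroundDemo {
    // Look-arounds that reject only brackets: the match can shrink to a partial word.
    static final Pattern NAIVE = Pattern.compile("(?<!\\[)\\w+(?!\\])");

    // Look-arounds that reject brackets and word characters, as in the regex above.
    static final Pattern STRICT = Pattern.compile("(?<![\\[\\w])\\w+(?![\\w\\]])");

    // Returns every match of the given pattern in the input.
    static List<String> findAll(Pattern p, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll(NAIVE, "[chicken]"));          // prints [hicke]
        System.out.println(findAll(STRICT, "[chicken]"));         // prints []
        System.out.println(findAll(STRICT, "chicken [chicken]")); // prints [chicken]
    }
}
```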

#4


0  

Without look-arounds:

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MatchingTest
{
    private static String x = "pig [cow] chicken bull] [grain";

    public static void main(String[] args)
    {
        Pattern p = Pattern.compile("(\\[?)(\\w+)(\\]?)");
        Matcher m = p.matcher(x);
        while(m.find())
        {
            String firstBracket = m.group(1);
            String word = m.group(2);
            String lastBracket = m.group(3);
            if ("".equals(firstBracket) && "".equals(lastBracket))
            {
                System.out.println(word);
            }
        }
    }
}

Output:

pig
chicken

A bit more verbose, sure, but I find it more readable and easier to understand. Certainly simpler than a huge regular expression trying to handle all possible combinations of brackets.

Note that this won't filter out input like [fence tree grass]; it will report tree as a match. You cannot skip tree there without tracking whether you are inside a bracketed span, which in general calls for a parser. Hopefully, this is not a case you need to handle.
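If the brackets are never nested and always balanced, one possible workaround is to let the regex consume each whole bracketed span first and capture only what is left. This is a sketch under those assumptions, not a replacement for a parser; the class name `SkipBracketedSpans` and the helper `wordsOutsideBrackets` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; assumes brackets are balanced and not nested.
class SkipBracketedSpans {
    // Alternative 1 swallows a complete "[...]" span without capturing it.
    // Alternative 2 captures a bare word in group 1.
    static final Pattern TOKEN = Pattern.compile("\\[[^\\]]*\\]|(\\w+)");

    // Returns the words that lie outside any complete bracketed span.
    static List<String> wordsOutsideBrackets(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String word = m.group(1); // null when the bracketed alternative matched
            if (word != null) {
                found.add(word);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints [pig, chicken]: fence, tree and grass are all skipped.
        System.out.println(wordsOutsideBrackets("pig [fence tree grass] chicken"));
    }
}
```

Unlike the code above, this sketch does not filter words next to a lone bracket: for "pig [cow] chicken bull] [grain" it would also return bull and grain.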
