Regex: Match all hyphens or underscores not at the beginning or the end of the string


I am writing some code that needs to convert a string to camel case. However, I want to keep any _ or - at the beginning or end of the string.

I have had success matching a single _ character with this regex:

^(?!_)(\w+)_(\w+)(?<!_)$

when the inputs are:

pro_gamer #matched
#ignored
_proto 
proto_
__proto
proto__
__proto__
#matched as nerd_godess_of, skyrim
nerd_godess_of_skyrim

I recursively apply my method to the first capture group when it still contains a separator, as in nerd_godess_of.

I am having trouble adding - matches to the same regex. I assumed that just adding a - to the mix like this would work:

^(?![_-])(\w+)[_-](\w+)(?<![_-])$

and it matches like this:

super-mario #matched
eslint-path #matched
eslint-global-path #NOT MATCHED.

I would like to understand why the regex fails to match the last case given that it worked correctly for the _.

The (almost) full set of test inputs can be found here.

3 Answers

#1


3  

The fact that

^(?![_-])(\w+)[_-](\w+)(?<![_-])$

does not match "eslint-global-path" follows from the anchors ^ and $, which force the pattern to cover the whole string while allowing only one explicit [_-] separator. This regex reads, "Match the beginning of the line, not followed by a hyphen or underscore, then match one or more word characters (including underscores) in capture group 1, a hyphen or underscore, and one or more word characters in capture group 2. Lastly, require that the end of the line is not preceded by a hyphen or underscore."

The fact that an underscore (but not a hyphen) is a word (\w) character is what makes the two cases behave differently: \w+ can absorb extra underscores, as in nerd_godess_of, but it can never cross a hyphen. In general, rather than using \w, you might want to use \p{Alpha} or \p{Alnum} (or POSIX [[:alpha:]] or [[:alnum:]]).
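
The difference between the two separators can be reproduced with a small sketch. It assumes Ruby (the language used in the rest of this answer); the constant names UNDERSCORE_ONLY and WITH_HYPHEN are hypothetical and only label the two patterns from the question.

```ruby
# The two patterns from the question, stored under hypothetical names.
UNDERSCORE_ONLY = /^(?!_)(\w+)_(\w+)(?<!_)$/
WITH_HYPHEN     = /^(?![_-])(\w+)[_-](\w+)(?<![_-])$/

# Greedy \w+ also consumes underscores, so group 1 swallows the earlier ones.
m = UNDERSCORE_ONLY.match('nerd_godess_of_skyrim')
p m[1]  #=> "nerd_godess_of"
p m[2]  #=> "skyrim"

# \w never matches "-", so the pattern can account for only one hyphen.
p WITH_HYPHEN.match?('eslint-path')         #=> true
p WITH_HYPHEN.match?('eslint-global-path')  #=> false
```

With two hyphens there is no way to split the string into "word characters, one separator, word characters", so the anchored match fails as a whole.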

Try this.

r = /
    (?<=     # begin a positive lookbehind
      [^_-]  # match a character other than an underscore or hyphen
    )        # end positive lookbehind
    (        # begin capture group 1
      (?:    # begin a non-capture group
        -+   # match one or more hyphens
        |    # or
        _+   # match one or more underscores
      )      # end non-capture group
      [^_-]  # match any character other than an underscore or hyphen
    )        # end capture group 1
    /x       # free-spacing regex definition mode

'_cats_have--nine_lives--'.gsub(r) { |s| s[-1].upcase }
  #=> "_catsHaveNineLives--"

This regex is conventionally written as follows.

r = /(?<=[^_-])((?:-+|_+)[^_-])/
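
As a minimal sketch, the one-line regex can be wrapped in a helper and run against the inputs from the question. The names SEPARATOR_THEN_CHAR and camelize are hypothetical, chosen only for this illustration.

```ruby
# Same pattern as above: a separator run plus the character after it,
# preceded by something that is not a separator.
SEPARATOR_THEN_CHAR = /(?<=[^_-])((?:-+|_+)[^_-])/

# Replace each match with its last character, upcased.
def camelize(str)
  str.gsub(SEPARATOR_THEN_CHAR) { |s| s[-1].upcase }
end

p camelize('pro_gamer')              #=> "proGamer"
p camelize('eslint-global-path')     #=> "eslintGlobalPath"
p camelize('nerd_godess_of_skyrim')  #=> "nerdGodessOfSkyrim"
p camelize('__proto__')              #=> "__proto__"
```

Because gsub finds every match on its own, no recursion over the first capture group is needed.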

If all the letters are lower case one could alternatively write

'_cats_have--nine_lives--'.split(/(?<=[^_-])(?:_+|-+)(?=[^_-])/).
  map(&:capitalize).join
  #=> "_catsHaveNineLives--"

where

'_cats_have--nine_lives--'.split(/(?<=[^_-])(?:_+|-+)(?=[^_-])/)
  #=> ["_cats", "have", "nine", "lives--"]

(?=[^_-]) is a positive lookahead: it requires the separator run on which the split is made to be followed by a character other than an underscore or hyphen, so trailing separators such as the final -- are left in place.
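
One caveat with the split version is that capitalize is applied to every piece, including the first, and it downcases the rest of each piece. For an input such as 'pro_gamer' that gives "ProGamer". The sketch below is a possible variant, not part of the original answer, that leaves the first piece untouched; INNER_SEPARATORS and camelize_by_split are hypothetical names.

```ruby
# Separator runs that sit between two non-separator characters.
INNER_SEPARATORS = /(?<=[^_-])(?:_+|-+)(?=[^_-])/

# Keep the first piece as-is and capitalize only the following pieces.
def camelize_by_split(str)
  first, *rest = str.split(INNER_SEPARATORS)
  first.to_s + rest.map(&:capitalize).join
end

p camelize_by_split('pro_gamer')                 #=> "proGamer"
p camelize_by_split('_cats_have--nine_lives--')  #=> "_catsHaveNineLives--"
p camelize_by_split('my_HTTP_server')            #=> "myHttpServer"
```

The last call shows why the lower-case assumption matters: capitalize turns "HTTP" into "Http".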

#2


0  

You can try this regex:

^(?=[^-_])(\w+[-_]\w*)+(?=[^-_])\w$

See the demo here.
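
For readers without access to the demo, here is a rough check of the pattern, assuming Ruby's regex engine. WHOLE_STRING is a hypothetical name. Note that this pattern tests the string as a whole; it does not locate each separator individually.

```ruby
# The pattern from this answer, stored under a hypothetical name.
WHOLE_STRING = /^(?=[^-_])(\w+[-_]\w*)+(?=[^-_])\w$/

# Inputs with separators only in the middle are accepted.
%w[pro_gamer super-mario eslint-global-path nerd_godess_of_skyrim].each do |s|
  puts "#{s}: #{WHOLE_STRING.match?(s)}"  # each prints true
end

# Inputs with a leading or trailing separator are rejected.
%w[_proto proto_ __proto__].each do |s|
  puts "#{s}: #{WHOLE_STRING.match?(s)}"  # each prints false
end
```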

#3


-1  

Switch _- to -_ so that - is not treated as a range operator, as in a-z.
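
As a quick check of this suggestion, the sketch below (assuming Ruby, as in answer #1) compares the two orderings. A hyphen at the very start or end of a character class is normally taken literally, so both classes are expected to behave the same here; the reordering is harmless but may not change the result. HYPHEN_LAST and HYPHEN_FIRST are hypothetical names.

```ruby
# The two orderings of the character class, under hypothetical names.
HYPHEN_LAST  = /[_-]/
HYPHEN_FIRST = /[-_]/

# "`" sits between "_" and "a" in ASCII, so it would match if a range were formed.
samples = ['-', '_', 'a', '`']

p samples.map { |s| HYPHEN_LAST.match?(s) }   #=> [true, true, false, false]
p samples.map { |s| HYPHEN_FIRST.match?(s) }  #=> [true, true, false, false]
```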
