Problem description
I am trying to replace certain patterns in a string with different replacement patterns.
Example:
string test = "test replacing \"these characters\"";
What I want to do is replace every ' ' with '_' and every other non-letter, non-digit character with an empty string. I have created the following regex and it seems to tokenize correctly, but I am not sure how to (if possible) perform a conditional replace using regex_replace.
string test = "test replacing \"these characters\"";
regex reg("(\\s+)|(\\W+)");
The expected result after the replacement is:
string result = "test_replacing_these_characters";
I cannot use Boost, which is why I left it out of the tags, so please post no answers that rely on Boost. I have to do this with the standard library. It may be that a different regex would accomplish the goal, or that I am simply stuck doing two passes.
I did not remember which characters were included in \w at the time I wrote my original regex; after looking it up I have simplified the expression further. Again, the goal is that anything matching \s+ should be replaced with '_' and anything matching \W+ should be replaced with an empty string.
Recommended answer
The C++ (0x, 11, TR1) regular expressions do not really work in every case (see stackoverflow; look up the phrase "regex" on this page for gcc), so it is better to use Boost for a while.
You can check whether your compiler supports the regular expressions needed:
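#include <string>
#include <iostream>
#include <regex>
using namespace std;
int main(int argc, char * argv[]) {
    string test = "test replacing \"these characters\"";
    // Every run of non-word characters (spaces and quotes alike) becomes one '_',
    // so the trailing quote also turns into an underscore.
    regex reg("[^\\w]+");
    test = regex_replace(test, reg, "_");
    cout << test << endl;
}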
The above works in Visual Studio 2012 RC.
Edit 1: I don't think replacing with two different strings in one pass (depending on the match) will work here. In Perl, this could easily be done with an evaluated replacement expression (the /e switch).
Therefore, you'll need two passes, as you already suspected:
...
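string test = "test replacing \"these characters\"";
test = regex_replace(test, regex("\\s+"), "_"); // pass 1: whitespace runs -> '_'
test = regex_replace(test, regex("\\W+"), "");  // pass 2: drop other non-word runs ('_' is a word character and survives)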
...
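As a self-contained version of the two-pass idea, here is a minimal sketch using only the standard library. The function name sanitize_two_pass is a made-up helper for illustration and is not part of the original answer; it assumes the default ECMAScript grammar, in which \W does not match '_'.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (name chosen for illustration only).
// Pass 1 turns each run of whitespace into a single '_'.
// Pass 2 removes every remaining run of non-word characters;
// '_' counts as a word character, so the underscores from pass 1 survive.
std::string sanitize_two_pass(const std::string& input) {
    static const std::regex whitespace(R"(\s+)");
    static const std::regex non_word(R"(\W+)");
    const std::string with_underscores = std::regex_replace(input, whitespace, "_");
    return std::regex_replace(with_underscores, non_word, "");
}
```

Raw string literals (R"(...)") avoid having to double every backslash in the pattern.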
Edit 2:
If it were possible to pass a callback function tr() to regex_replace, you could decide on the substitution there, like:
string output = regex_replace(test, regex("\\s+|\\W+"), tr); // hypothetical: callback as the replacement
with tr() doing the replacement work:
string tr(const smatch &m) { return m[0].str()[0] == ' ' ? "_" : ""; }
and the problem would be solved. Unfortunately, std::regex_replace in C++11 has no overload that takes a callback, but Boost has one. The following works with Boost and uses one pass:
...
#include <boost/regex.hpp>
using namespace boost;
...
string tr(const smatch &m) { return m[0].str()[0] == ' ' ? "_" : ""; }
...
string test = "test replacing \"these characters\"";
test = regex_replace(test, regex("\\s+|\\W+"), tr); // <= works in Boost
...
Maybe some day this will work with C++11 or whatever number comes next.
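For completeness, a single pass is possible with the standard library alone by walking the matches with std::sregex_iterator and choosing the replacement per match. This is only a sketch: the function name sanitize_one_pass is hypothetical, and the second alternative is written as [^\w\s]+ rather than \W+ (an assumption that differs from the pattern in the question) so that a run of punctuation cannot swallow the whitespace that follows it.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (name chosen for illustration only).
// Group 1 matches whitespace runs, group 2 matches runs of characters
// that are neither word characters nor whitespace.
std::string sanitize_one_pass(const std::string& input) {
    static const std::regex pattern(R"((\s+)|([^\w\s]+))");
    std::string result;
    auto last = input.cbegin();  // end of the previous match
    for (std::sregex_iterator it(input.cbegin(), input.cend(), pattern), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        result.append(last, m[0].first);  // copy the unmatched text before this match
        if (m[1].matched) {
            result += '_';                // whitespace run -> single underscore
        }                                 // otherwise: drop the match entirely
        last = m[0].second;
    }
    result.append(last, input.cend());    // copy whatever follows the final match
    return result;
}
```

Checking m[1].matched plays the role of the tr() callback above: the decision depends on which alternative matched, not on a fixed format string.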
Regards
rbo