Problem description
I have a text that might use different newline styles. I want to replace all newlines ('\r\n', '\n', '\r') with the same newline (in this case \r\n).
What's the fastest way to do this? My current solution looks like this, which is pretty ugly:
$sNicetext = str_replace("\r\n", '%%%%somthing%%%%', $sNicetext);
$sNicetext = str_replace(array("\r", "\n"), array("\r\n", "\r\n"), $sNicetext);
$sNicetext = str_replace('%%%%somthing%%%%', "\r\n", $sNicetext);
The problem is that you can't do this with a single replace, because an existing \r\n would be duplicated to \r\n\r\n.
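The duplication described above can be reproduced with a short sketch. It assumes a two-character sample string (the variable names `$input` and `$naive` are made up for illustration) and relies on the fact that `str_replace()` applies the entries of an array search one after another, so a line feed inserted by an earlier step gets replaced again by a later one.

```php
<?php
// Illustration only: naive array-based replacement on text that already
// contains a CRLF. str_replace() works through the search items in order.
$input = "a\r\nb";
$naive = str_replace(array("\r", "\n"), "\r\n", $input);

// json_encode() makes the control characters visible when printed.
echo json_encode($input), "\n";
echo json_encode($naive), "\n";

// The single line break has been multiplied instead of normalized.
var_dump($naive === $input);
```

This is why the question's workaround first hides the existing CRLF pairs behind a placeholder.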
Thanks for your help!
Recommended answer
$string = preg_replace('~\R~u', "\r\n", $string);
If you don't want to replace all Unicode newlines but only the CR, LF, and CRLF styles, use:
$string = preg_replace('~(*BSR_ANYCRLF)\R~', "\r\n", $string);
\R matches these newlines; u is a modifier that makes the input string be treated as UTF-8.
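As a rough sketch of how the two patterns differ in practice, the hypothetical helper `normalize_newlines()` below (its name and parameters are not from the answer) wraps both variants. It spells out `(*BSR_UNICODE)` explicitly so the result does not depend on how the PCRE library was built, and it assumes PHP 7 or later for the `\u{2028}` string escape.

```php
<?php
/**
 * Hypothetical helper: rewrite every line break in $text as $eol.
 * $eol is passed to preg_replace() as the replacement string, so it
 * should not contain "$" or backslashes.
 */
function normalize_newlines(string $text, string $eol = "\r\n", bool $unicode = true): string
{
    $pattern = $unicode
        ? '~(*BSR_UNICODE)\R~u'  // any Unicode newline sequence, subject read as UTF-8
        : '~(*BSR_ANYCRLF)\R~';  // only CR, LF, or CRLF

    $result = preg_replace($pattern, $eol, $text);
    if ($result === null) {
        // With the u modifier this happens, for example, on invalid UTF-8 input.
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }
    return $result;
}

// Mixed input: CR, LF, CRLF, and U+2028 LINE SEPARATOR.
$mixed = "one\rtwo\nthree\r\nfour\u{2028}five";

echo json_encode(normalize_newlines($mixed)), "\n";                // U+2028 is rewritten too
echo json_encode(normalize_newlines($mixed, "\r\n", false)), "\n"; // U+2028 is left alone
```

Checking for `null` is a design choice for this sketch: `preg_replace()` signals errors that way, and silently returning `null` would discard the whole text.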
From the PCRE docs:
What \R matches
By default, the sequence \R in a pattern matches any Unicode newline sequence, whatever has been selected as the line ending sequence. If you specify
--enable-bsr-anycrlf
the default is changed so that \R matches only CR, LF, or CRLF. Whatever is selected when PCRE is built can be overridden when the library functions are called.
and
Newline sequences
Outside a character class, by default, the escape sequence \R matches any Unicode newline sequence. In non-UTF-8 mode \R is equivalent to the following:
(?>\r\n|\n|\x0b|\f|\r|\x85)
This is an example of an "atomic group", details of which are given below. This particular group matches either the two-character sequence CR followed by LF, or one of the single characters LF (linefeed, U+000A), VT (vertical tab, U+000B), FF (formfeed, U+000C), CR (carriage return, U+000D), or NEL (next line, U+0085). The two-character sequence is treated as a single unit that cannot be split.
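To make "cannot be split" concrete, here is a small sketch that compares an atomic group with an ordinary non-capturing group. It uses explicit groups with a shortened alternative list rather than \R itself, and the variable names are invented for the illustration.

```php
<?php
$crlf = "\r\n";

// Atomic group: once CR LF has been consumed as a unit, the engine may not
// backtrack into the group to hand the LF back, so the trailing \n finds nothing.
$atomic = preg_match('~(?>\r\n|\n|\r)\n~', $crlf);

// Ordinary group: the engine backtracks, lets the group match CR alone,
// and the trailing \n then matches the LF.
$plain = preg_match('~(?:\r\n|\n|\r)\n~', $crlf);

var_dump($atomic); // int(0)
var_dump($plain);  // int(1)
```

For newline normalization this atomic behaviour is what keeps a CRLF pair from being counted as two separate line breaks.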
In UTF-8 mode, two additional characters whose codepoints are greater than 255 are added: LS (line separator, U+2028) and PS (paragraph separator, U+2029). Unicode character property support is not needed for these characters to be recognized.
It is possible to restrict \R to match only CR, LF, or CRLF (instead of the complete set of Unicode line endings) by setting the option PCRE_BSR_ANYCRLF either at compile time or when the pattern is matched. (BSR is an abbreviation for "backslash R".) This can be made the default when PCRE is built; if this is the case, the other behaviour can be requested via the PCRE_BSR_UNICODE option. It is also possible to specify these settings by starting a pattern string with one of the following sequences:
(*BSR_ANYCRLF) CR, LF, or CRLF only
(*BSR_UNICODE) any Unicode newline sequence
These override the default and the options given to pcre_compile() or pcre_compile2(), but they can be overridden by options given to pcre_exec() or pcre_dfa_exec(). Note that these special settings, which are not Perl-compatible, are recognized only at the very start of a pattern, and that they must be in upper case. If more than one of them is present, the last one is used. They can be combined with a change of newline convention; for example, a pattern can start with:
(*ANY)(*BSR_ANYCRLF)
They can also be combined with the (*UTF8) or (*UCP) special sequences. Inside a character class, \R is treated as an unrecognized escape sequence, and so matches the letter "R" by default, but causes an error if PCRE_EXTRA is set.
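The combined prefix (*ANY)(*BSR_ANYCRLF) quoted above can be tried from PHP as in the following sketch. The sample text and variable names are made up, and the example assumes the PCRE library bundled with PHP accepts the (*LF) and (*ANY) newline-convention settings at the start of a pattern. The newline convention governs where ^ and $ match in multiline mode, while the BSR setting only governs \R.

```php
<?php
$text = "foo\rbar\nbaz";

// (*LF): only LF ends a line, so the lone CR does not count as a line break
// and ^ / $ cannot match around it.
$lfCount = preg_match_all('~(*LF)^\w+$~m', $text, $lfMatches);

// (*ANY): CR, LF, and CRLF (among others) all end a line.
// (*BSR_ANYCRLF) is included to show the combined prefix; it would restrict
// \R if the pattern used it.
$anyCount = preg_match_all('~(*ANY)(*BSR_ANYCRLF)^\w+$~m', $text, $anyMatches);

print_r($lfMatches[0]);
print_r($anyMatches[0]);
```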