Problem description
I have this CSV line:
std::string s = R"(1997,Ford,E350,"ac, abs, moon","some "rusty" parts",3000.00)";
I can parse it using boost::tokenizer:
typedef boost::tokenizer< boost::escaped_list_separator<char> , std::string::const_iterator, std::string> Tokenizer;
boost::escaped_list_separator<char> seps('\\', ',', '"');
Tokenizer tok(s, seps);
for (auto i : tok)
{
std::cout << i << std::endl;
}
It gets it right, except for the token "rusty": it should keep its double quotes, but they are getting stripped.
Here is my attempt to use boost::spirit:
boost::spirit::classic::rule<> list_csv_item = !(boost::spirit::classic::confix_p('"', *boost::spirit::classic::c_escape_ch_p, '"') | boost::spirit::classic::longest_d[boost::spirit::classic::real_p | boost::spirit::classic::int_p]);
std::vector<std::string> vec_item;
std::vector<std::string> vec_list;
boost::spirit::classic::rule<> list_csv = boost::spirit::classic::list_p(list_csv_item[boost::spirit::classic::push_back_a(vec_item)],',')[boost::spirit::classic::push_back_a(vec_list)];
boost::spirit::classic::parse_info<> result = parse(s.c_str(), list_csv);
if (result.hit)
{
for (auto i : vec_item)
{
cout << i << endl;
}
}
Problems:
It does not work; it prints only the first token.
Why boost::spirit::classic? I can't find examples using Spirit V2.
The setup is brutal, but I can live with that.
** I really want to use boost::spirit because it tends to be pretty fast.
Expected output:
1997
Ford
E350
ac, abs, moon
some "rusty" parts
3000.00
Recommended answer
Sehe's post looks a fair bit cleaner than mine, but I had been putting this together for a while, so here it is anyway:
#include <cassert>
#include <iostream>
#include <string>
#include <vector>
#include <boost/tokenizer.hpp>
#include <boost/spirit/include/qi.hpp>
namespace qi = boost::spirit::qi;
int main() {
    const std::string s = R"(1997,Ford,E350,"ac, abs, moon",""rusty"",3000.00)";
    // Tokenizer: backslash as escape character, ',' as separator, '"' as quote
    typedef boost::tokenizer< boost::escaped_list_separator<char> , std::string::const_iterator, std::string> Tokenizer;
    boost::escaped_list_separator<char> seps('\\', ',', '"');
    Tokenizer tok(s, seps);
    for (auto i : tok)
        std::cout << i << "\n";
    std::cout << "\n";
    // Boost Spirit Qi
    // A quoted string yields its contents without the surrounding quotes.
    qi::rule<std::string::const_iterator, std::string()> quoted_string = '"' >> *(qi::char_ - '"') >> '"';
    // Any single character that is neither a quote nor a separator.
    qi::rule<std::string::const_iterator, std::string()> valid_characters = qi::char_ - '"' - ',';
    // An item is any run of quoted strings and plain characters.
    qi::rule<std::string::const_iterator, std::string()> item = *(quoted_string | valid_characters );
    // A line is a comma-separated list of items.
    qi::rule<std::string::const_iterator, std::vector<std::string>()> csv_parser = item % ',';
    std::string::const_iterator s_begin = s.begin();
    std::string::const_iterator s_end = s.end();
    std::vector<std::string> result;
    bool r = boost::spirit::qi::parse(s_begin, s_end, csv_parser, result);
    assert(r);
    assert(s_begin == s_end); // the whole input was consumed
    for (auto i : result)
        std::cout << i << std::endl;
    std::cout << "\n";
}
This outputs:
1997
Ford
E350
ac, abs, moon
rusty
3000.00
1997
Ford
E350
ac, abs, moon
rusty
3000.00
Something worth noting: this doesn't implement a full CSV parser. You'd also want to look into escape characters, or whatever else your implementation requires.
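To make the escaping point concrete, here is a minimal sketch of one common convention (the one described in RFC 4180), where a literal double quote inside a quoted field is written as two consecutive double quotes. It uses only the standard library rather than Spirit, and the function name split_csv_line is hypothetical, chosen just for this illustration. It assumes a single line with ',' as the separator and does not handle backslash escapes or embedded newlines.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): splits one CSV line into fields.
// Assumed rules: fields are separated by ','; a field may be wrapped in
// double quotes; inside a quoted field, "" stands for one literal ".
std::vector<std::string> split_csv_line(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == '"' && i + 1 < line.size() && line[i + 1] == '"') {
                current += '"';     // doubled quote: keep one, skip the other
                ++i;
            } else if (c == '"') {
                in_quotes = false;  // closing quote, not part of the field
            } else {
                current += c;       // commas are ordinary characters here
            }
        } else if (c == '"') {
            in_quotes = true;       // opening quote, not part of the field
        } else if (c == ',') {
            fields.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);      // the last field has no trailing comma
    return fields;
}
```

Note that this expects the inner quotes to be doubled in the input, i.e. "some ""rusty"" parts" rather than "some "rusty" parts" as written in the question. With the doubled form, the fifth field comes out as some "rusty" parts.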
Also: if you're looking into the documentation, just so you know, in Qi, 'a' is equivalent to boost::spirit::qi::lit('a') and "abc" is equivalent to boost::spirit::qi::lit("abc").
On double quotes: as Sehe notes in a comment above, it is not clear what a "" in the input text is supposed to mean. If you want every "" that is not inside a quoted string to be converted to a single ", then something like the following would work:
qi::rule<std::string::const_iterator, std::string()> double_quote_char = "\"\"" >> qi::attr('"');
qi::rule<std::string::const_iterator, std::string()> item = *(double_quote_char | quoted_string | valid_characters );