Why does reading record struct fields from std::istream fail, and how can I fix it?

Suppose we have the following situation:

A record struct is declared as follows

    struct Person {
        unsigned int id;
        std::string name;
        uint8_t age;
        // ...
    };

Records are stored in a file using the following format:

    ID      Forename Lastname Age
    ------------------------------
    1267867 John     Smith    32
    67545   Jane     Doe      36
    8677453 Gwyneth  Miller   56
    75543   J. Ross  Unusual  23
    ...

The file should be read in to collect an arbitrary number of the Person records mentioned above:

    std::ifstream ifs("SampleInput.txt");
    std::vector<Person> persons;

    Person actRecord;
    while(ifs >> actRecord.id >> actRecord.name >> actRecord.age) {
        persons.push_back(actRecord);
    }

    if(!ifs) {
        std::cerr << "Input format error!" << std::endl;
    }

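Question: (this is a frequently asked question, in one form or another)
What can I do to read the separate values into the fields of a single actRecord variable?

The code sample above ends with a runtime error: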

    Runtime error    time: 0 memory: 3476 signal:-1
    stderr: Input format error!

One viable solution is to reorder the input fields (if that is possible)

    ID      Age Forename Lastname
    1267867 32  John     Smith
    67545   36  Jane     Doe
    8677453 56  Gwyneth  Miller
    75543   23  J. Ross  Unusual
    ...

and read in the records as follows

    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <vector>

    struct Person {
        unsigned int id;
        std::string name;
        uint8_t age;
        // ...
    };

    int main() {
        std::istream& ifs = std::cin; // Open file alternatively
        std::vector<Person> persons;

        Person actRecord;
        unsigned int age; // temporary: extracting into uint8_t would read a single character
        // std::ws skips the blank after the age, so the name has no leading space
        while(ifs >> actRecord.id >> age && std::getline(ifs >> std::ws, actRecord.name)) {
            actRecord.age = uint8_t(age);
            persons.push_back(actRecord);
        }

        return 0;
    }

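There is whitespace between the forename and the last name. Change your class to hold the forename and the last name as separate strings and it should work. Another thing you can do is read into two separate variables, such as name1 and name2, and assign them as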

 actRecord.name = name1 + " " + name2;
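The two-variable approach can be sketched as a small helper. This is only an illustration: the function name `parse_two_word_record` is hypothetical, it parses one line at a time through a `std::istringstream`, and it assumes every name consists of exactly two words. The age goes through a temporary `unsigned int` because `uint8_t` is normally a character type.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

struct Person {
    unsigned int id;
    std::string name;
    std::uint8_t age;
};

// Hypothetical helper: parses one "ID Forename Lastname Age" line.
// Assumption: the name is made of exactly two whitespace-separated words.
bool parse_two_word_record(const std::string& line, Person& p) {
    std::istringstream iss(line);
    std::string name1, name2;
    unsigned int age = 0; // temporary, narrowed to uint8_t below
    if (!(iss >> p.id >> name1 >> name2 >> age)) {
        return false; // wrong number of words or non-numeric id/age
    }
    p.name = name1 + " " + name2;
    p.age = static_cast<std::uint8_t>(age);
    return true;
}
```

A line such as `75543 J. Ross Unusual 23` is rejected by this sketch, because the third word is taken for the age. That is the limitation the other answers try to address.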

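Here is an implementation of a manipulator I came up with that counts the delimiter across each extracted character. Given the number of delimiters you specify, it extracts that many words from the input stream.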

    #include <cstddef>
    #include <istream>
    #include <iterator>
    #include <string>

    template<class charT>
    struct word_inserter_impl
    {
        word_inserter_impl(std::size_t words, std::basic_string<charT>& str, charT delim)
            : str_(str)
            , delim_(delim)
            , words_(words)
        { }

        friend std::basic_istream<charT>&
        operator>>(std::basic_istream<charT>& is, const word_inserter_impl<charT>& wi)
        {
            // The sentry skips leading whitespace before the first word.
            typename std::basic_istream<charT>::sentry ok(is);

            if (ok) {
                wi.str_.clear(); // do not append to the previous record's name
                std::istreambuf_iterator<charT> it(is), end;
                std::back_insert_iterator<std::basic_string<charT>> dest(wi.str_);

                while (it != end && wi.words_) {
                    // Stop at the delimiter that follows the last requested word.
                    if (*it == wi.delim_ && --wi.words_ == 0) {
                        break;
                    }
                    *dest++ = *it++;
                }
            }
            return is;
        }
    private:
        std::basic_string<charT>& str_;
        charT delim_;
        mutable std::size_t words_;
    };

    template<class charT = char>
    word_inserter_impl<charT> word_inserter(std::size_t words, std::basic_string<charT>& str, charT delim = charT(' '))
    {
        return word_inserter_impl<charT>(words, str, delim);
    }

Now you can do:

    unsigned int age; // temporary: extracting into uint8_t would read a single character
    while (ifs >> actRecord.id >> word_inserter(2, actRecord.name) >> age) {
        actRecord.age = uint8_t(age);
        std::cout << actRecord.id << " " << actRecord.name << " " << age << '\n';
    }


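A solution is to read the first entry into the ID variable.
Then read all the other words from the line (just push them into a temporary vector) and construct the person's name from all the elements except the last entry, which is the age.

This lets you keep the age in the last position while still being able to handle names like "J. Ross Unusual".

Update: here is some code that illustrates the theory above: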

    #include <algorithm>
    #include <string>
    #include <vector>
    #include <iterator>
    #include <fstream>
    #include <sstream>
    #include <iostream>

    struct Person
    {
        unsigned int id;
        std::string name;
        int age;
    };

    int main()
    {
        std::ifstream ifs("in.txt");
        std::vector<Person> persons;

        std::string line;
        while (std::getline(ifs, line))
        {
            std::istringstream iss(line);

            // first: the ID, simply read it
            // (lines that do not start with a number, such as the header, are skipped)
            Person actRecord;
            if (!(iss >> actRecord.id))
            {
                continue;
            }

            // next: read in everything else on the line
            std::string temp;
            std::vector<std::string> tempvect;
            while (iss >> temp)
            {
                tempvect.push_back(temp);
            }

            // a record needs at least one name word plus the age
            if (tempvect.size() < 2)
            {
                continue;
            }

            // then: the name, let's join the words so that we do not get a trailing space
            std::ostringstream oss;
            std::copy(tempvect.begin(), tempvect.end() - 2,
                      std::ostream_iterator<std::string>(oss, " "));
            // the last word of the name
            oss << *(tempvect.end() - 2);
            actRecord.name = oss.str();

            // and the age, which is the last element
            actRecord.age = std::stoi(tempvect.back());

            persons.push_back(actRecord);
        }

        for (std::vector<Person>::const_iterator it = persons.begin(); it != persons.end(); ++it)
        {
            std::cout << it->id << ":" << it->name << ":" << it->age << std::endl;
        }
    }

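Since we can easily split a line on whitespace, and we know that the only value that can contain separators is the name, a possible solution is to use a deque per line that holds the whitespace-separated elements of that line. The id and the age can easily be taken from the two ends of the deque, and the remaining elements can be concatenated to get the name: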

    #include <iostream>
    #include <fstream>
    #include <deque>
    #include <vector>
    #include <sstream>
    #include <iterator>
    #include <string>
    #include <algorithm>
    #include <utility>
    #include <cctype>
    #include <cstdint>

    struct Person
    {
        unsigned int id;
        std::string name;
        uint8_t age;
    };

    int main()
    {
        std::ifstream ifs("SampleInput.txt");
        std::vector<Person> records;

        std::string line;
        while (std::getline(ifs, line))
        {
            std::istringstream ss(line);

            // All whitespace-separated tokens of the line.
            std::deque<std::string> info(std::istream_iterator<std::string>(ss), {});

            // Skip anything that cannot be a record (header, separator, blank line):
            // a record has an id, at least one name word and an age.
            if (info.size() < 3 || !std::isdigit(static_cast<unsigned char>(info.front()[0])))
            {
                continue;
            }

            Person record;
            record.id = std::stoul(info.front()); info.pop_front();
            record.age = static_cast<uint8_t>(std::stoi(info.back())); info.pop_back();

            // Everything that is left belongs to the name.
            std::ostringstream name;
            std::copy(info.begin(), info.end(),
                      std::ostream_iterator<std::string>(name, " "));
            record.name = name.str();
            record.name.pop_back(); // drop the trailing space

            records.push_back(std::move(record));
        }

        for (auto& record : records)
        {
            std::cout << record.id << " " << record.name << " "
                      << static_cast<unsigned int>(record.age) << std::endl;
        }

        return 0;
    }

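Another solution is to require specific delimiter characters for a particular field, and to provide a special extraction manipulator for that purpose.

Suppose we define the delimiter character " and the input looks like this: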

    1267867 "John Smith"      32
    67545   "Jane Doe"        36
    8677453 "Gwyneth Miller"  56
    75543   "J. Ross Unusual" 23

The generally needed includes:

    #include <iostream>
    #include <vector>
    #include <iomanip>
    #include <string>
    #include <cctype>
    #include <cstdint>

The record declaration:

    struct Person {
        unsigned int id;
        std::string name;
        uint8_t age;
        // ...
    };

Declaration/definition of a proxy class (struct) that supports being used with the std::istream& operator>>(std::istream&, const delim_field_extractor_proxy&) global operator overload:

    struct delim_field_extractor_proxy
    {
        delim_field_extractor_proxy(std::string& field_ref, char delim = '"')
            : field_ref_(field_ref), delim_(delim) {}

        friend std::istream& operator>>(std::istream& is,
                                        const delim_field_extractor_proxy& extractor_proxy);

        void extract_value(std::istream& is) const
        {
            field_ref_.clear();
            char input;
            bool addChars = false; // true while we are between the two delimiters
            while (is)
            {
                is.get(input);
                if (is.eof())
                {
                    break;
                }
                if (input == delim_)
                {
                    addChars = !addChars;
                    if (!addChars)
                    {
                        break;    // closing delimiter: the field is complete
                    }
                    else
                    {
                        continue; // opening delimiter: start collecting
                    }
                }
                if (addChars)
                {
                    field_ref_ += input;
                }
            }
            // consume whitespaces
            while (std::isspace(is.peek()))
            {
                is.get();
            }
        }

        std::string& field_ref_;
        char delim_;
    };

    std::istream& operator>>(std::istream& is,
                             const delim_field_extractor_proxy& extractor_proxy)
    {
        extractor_proxy.extract_value(is);
        return is;
    }

Plumbing everything together and instantiating the delim_field_extractor_proxy:

    int main() {
        std::istream& ifs = std::cin; // Open file alternatively
        std::vector<Person> persons;

        Person actRecord;
        int act_age;
        while(ifs >> actRecord.id
                  >> delim_field_extractor_proxy(actRecord.name, '"')
                  >> act_age) {
            actRecord.age = uint8_t(act_age);
            persons.push_back(actRecord);
        }

        for(auto it = persons.begin(); it != persons.end(); ++it) {
            std::cout << it->id << ", " << it->name << ", " << int(it->age) << std::endl;
        }
        return 0;
    }


Note:
This solution also works well with a TAB character (\t) specified as the delimiter, which is useful for parsing standard .csv formats.
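For the double-quoted format shown above, the standard library may already be enough: since C++14, `std::quoted` from `<iomanip>` extracts a delimited string including embedded blanks. The sketch below swaps that manipulator in for the hand-written proxy. The helper name `parse_quoted_record` is hypothetical, and it assumes one record per line and a C++14 (or later) compiler.

```cpp
#include <cstdint>
#include <iomanip>   // std::quoted (C++14)
#include <sstream>
#include <string>

struct Person {
    unsigned int id;
    std::string name;
    std::uint8_t age;
};

// Hypothetical helper: parses one line such as
//   1267867 "John Smith" 32
bool parse_quoted_record(const std::string& line, Person& p) {
    std::istringstream iss(line);
    unsigned int age = 0; // temporary, narrowed to uint8_t below
    // std::quoted reads everything between the two '"' characters,
    // blanks included, and strips the quotes.
    if (!(iss >> p.id >> std::quoted(p.name) >> age)) {
        return false;
    }
    p.age = static_cast<std::uint8_t>(age);
    return true;
}
```

The default delimiter of `std::quoted` is `"` and the default escape character is `\`. Both can be passed explicitly as the second and third arguments if the file uses different characters.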

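What can I do to read the separate words that form the name into the single actRecord.name variable?

The general answer is: no, you can't do this without an additional delimiter specification and special parsing of the parts that form the intended actRecord.name contents.
This is because a std::string field is parsed only up to the next occurrence of a whitespace character.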

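It is worth noting that some standard formats (such as .csv) may require distinguishing blanks (' ') from tabs ('\t') or other characters in order to delimit certain record fields (which may not be visible at first glance).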
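If the fields are separated by tabs while the name may contain blanks, `std::getline` with an explicit delimiter keeps the two apart. The following is a minimal sketch, assuming a hypothetical `ID<TAB>Name<TAB>Age` layout with one record per line. The helper name `parse_tab_record` is made up for illustration.

```cpp
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>

struct Person {
    unsigned int id;
    std::string name;
    std::uint8_t age;
};

// Hypothetical helper: parses "ID<TAB>Name<TAB>Age".
// Blanks inside the name are preserved because only '\t' ends a field.
bool parse_tab_record(const std::string& line, Person& p) {
    std::istringstream iss(line);
    std::string id_text, age_text;
    if (!std::getline(iss, id_text, '\t') ||
        !std::getline(iss, p.name, '\t') ||
        !std::getline(iss, age_text)) {
        return false; // fewer than three tab-separated fields
    }
    try {
        p.id = static_cast<unsigned int>(std::stoul(id_text));
        p.age = static_cast<std::uint8_t>(std::stoul(age_text));
    } catch (const std::exception&) {
        return false; // std::invalid_argument or std::out_of_range
    }
    return true;
}
```

A line that uses blanks instead of tabs is reported as malformed by this sketch rather than being split at the wrong places.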

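Also note:
To read a uint8_t value as numeric input, you have to take a detour through a temporary unsigned int value. Reading directly into an unsigned char (aka uint8_t) will mess up the stream parsing state.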
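The effect can be seen with two tiny helpers. Both function names are hypothetical, and the sketch assumes, as on mainstream platforms, that `uint8_t` is an alias for `unsigned char`, so `operator>>` picks the character overload.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Extracts directly into uint8_t: the character overload of operator>>
// is selected, so exactly ONE non-whitespace character is read.
std::uint8_t read_age_directly(const std::string& text) {
    std::istringstream iss(text);
    std::uint8_t age = 0;
    iss >> age;
    return age;
}

// Extracts into a temporary unsigned int and narrows afterwards,
// so the digits are parsed as a number.
std::uint8_t read_age_via_temporary(const std::string& text) {
    std::istringstream iss(text);
    unsigned int age = 0;
    iss >> age;
    return static_cast<std::uint8_t>(age);
}
```

For the input `32`, the first helper returns the character `'3'` and leaves `2` in the stream, while the second returns the number 32. In the question's loop the same mechanism makes the age field swallow the `S` of `Smith`, so the next attempt to read an id hits `mith` and fails.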

Another attempt at solving the parsing problem.

    #include <cctype>
    #include <cstdint>
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>

    struct Person
    {
        unsigned int id;
        std::string name;
        uint8_t age;
    };

    int main()
    {
        std::ifstream ifs("test-115.in");
        std::vector<Person> persons;

        while (true)
        {
            Person actRecord;
            // Read the ID and the first part of the name.
            if (!(ifs >> actRecord.id >> actRecord.name))
            {
                break;
            }

            // Read the rest of the line.
            std::string line;
            std::getline(ifs, line);

            // Pickup the rest of the name from the rest of the line.
            // The last token in the rest of the line is the age.
            // All other tokens are part of the name.
            // The tokens can be separated by ' ' or '\t'.
            size_t pos = 0;
            size_t iter = 0;
            while ((iter = line.find_first_of(" \t", pos)) != std::string::npos)
            {
                actRecord.name += line.substr(pos, (iter - pos + 1));
                pos = iter + 1;

                // Skip multiple whitespace characters.
                while (pos < line.size() && std::isspace(static_cast<unsigned char>(line[pos])))
                {
                    ++pos;
                }
            }

            // Trim the last whitespace from the name.
            actRecord.name.erase(actRecord.name.size() - 1);

            // Extract the age.
            // std::stoi returns an integer. We are assuming that
            // it will be small enough to fit into an uint8_t.
            actRecord.age = static_cast<uint8_t>(std::stoi(line.substr(pos)));

            // Debugging aid.. Make sure we have extracted the data correctly.
            std::cout << "ID: " << actRecord.id
                      << ", name: " << actRecord.name
                      << ", age: " << (int)actRecord.age << std::endl;
            persons.push_back(actRecord);
        }

        // If came here before the EOF was reached, there was an
        // error in the input file.
        if (!(ifs.eof()))
        {
            std::cerr << "Input format error!" << std::endl;
        }
    }

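When I see an input file like this, I think it is not a (new-style) delimited file but a good old fixed-size-fields one, like the ones Fortran and Cobol programmers used to deal with. So I would parse it like this (note that I separated the forename and the last name):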

    #include <cstdint>
    #include <iostream>
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>

    struct Person {
        unsigned int id;
        std::string forename;
        std::string lastname;
        uint8_t age;
        // ...
    };

    int main() {
        std::ifstream ifs("file.txt");
        std::vector<Person> persons;

        std::string line;
        int fieldsize[] = {8, 9, 9, 4};

        while(std::getline(ifs, line)) {
            // Skip lines that are too short to reach the age field.
            if (line.size() <= 26) {
                continue;
            }

            Person person;
            int start = 0, last;
            unsigned int age; // temporary: uint8_t would be read as a single character
            std::stringstream fieldtxt;

            fieldtxt.str(line.substr(start, fieldsize[0]));
            if (!(fieldtxt >> person.id)) {
                continue; // header or separator line
            }
            start += fieldsize[0];

            person.forename = line.substr(start, fieldsize[1]);
            last = person.forename.find_last_not_of(' ') + 1;
            person.forename.erase(last); // strip the padding blanks
            start += fieldsize[1];

            person.lastname = line.substr(start, fieldsize[2]);
            last = person.lastname.find_last_not_of(' ') + 1;
            person.lastname.erase(last);
            start += fieldsize[2];

            fieldtxt.clear(); // reset a possible eofbit from the id extraction
            fieldtxt.str(line.substr(start, fieldsize[3]));
            fieldtxt >> age;
            person.age = uint8_t(age);

            persons.push_back(person);
        }
        return 0;
    }
