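I have some code (taken and adapted from here and here) that uses libclang in Python (on Windows) to parse C++ source files and print all of their declarations, as shown below: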
    import clang.cindex

    def parse_decl(node):
        reference_node = node.get_definition()
        if node.kind.is_declaration():
            print(node.kind, node.kind.name,
                  node.location.line, ',', node.location.column,
                  reference_node.displayname)
        for ch in node.get_children():
            parse_decl(ch)

    # configure path
    clang.cindex.Config.set_library_file('C:/Program Files (x86)/LLVM/bin/libclang.dll')

    index = clang.cindex.Index.create()
    trans_unit = index.parse(r'C:\path\to\sourcefile\test.cpp', args=['-std=c++11'])

    parse_decl(trans_unit.cursor)
For the following C++ source file (test_ok.cpp):
    /* test_ok.cpp */

    #include <iostream>
    #include <fstream>
    #include <string>
    #include <algorithm>
    #include <cmath>
    #include <iomanip>

    using namespace std;

    int main (int argc, char *argv[]) {
      int linecount = 0;
      double array[1000], sum=0, median=0, add=0;
      string filename;
      if (argc <= 1)
          {
              cout << "Error: no filename specified" << endl;
              return 0;
          }
    //program checks if a filename is specified
      filename = argv[1];
      ifstream myfile (filename.c_str());
      if (myfile.is_open())
      {
        myfile >> array[linecount];
        while ( myfile.good() )
        {
            linecount++;
            myfile >> array[linecount];
        }
        myfile.close();
      }
the parse works as expected and prints:

    CursorKind.USING_DIRECTIVE USING_DIRECTIVE 10 , 17 std
    CursorKind.FUNCTION_DECL FUNCTION_DECL 12 , 5 main(int, char **)
    CursorKind.PARM_DECL PARM_DECL 12 , 15 argc
    CursorKind.PARM_DECL PARM_DECL 12 , 27 argv
    CursorKind.VAR_DECL VAR_DECL 13 , 7 linecount
    CursorKind.VAR_DECL VAR_DECL 14 , 10 array
    CursorKind.VAR_DECL VAR_DECL 14 , 23 sum
    CursorKind.VAR_DECL VAR_DECL 14 , 30 median
    CursorKind.VAR_DECL VAR_DECL 14 , 40 add
    CursorKind.VAR_DECL VAR_DECL 15 , 10 filename
    CursorKind.VAR_DECL VAR_DECL 23 , 12 myfile

    Process finished with exit code 0
However, for the following C++ source file (test.cpp):
    /* test.cpp */
    #include <iostream>
    #include <vector>
    #include <fstream>
    #include <cmath>
    #include <algorithm>
    #include <iomanip>
    using namespace std;

    void readfunction(vector<double>& numbers, ifstream& myfile)
    {
        double number;
        while (myfile >> number) {
            numbers.push_back(number);}
    }

    double meanfunction(vector<double>& numbers)
    {
        double total=0;
        vector<double>::const_iterator i;
        for (i=numbers.begin(); i!=numbers.end(); ++i) {
            total +=*i; }
        return total/numbers.size();
    }
the parse is incomplete:
    CursorKind.USING_DIRECTIVE USING_DIRECTIVE 8 , 17 std
    CursorKind.VAR_DECL VAR_DECL 10 , 6 readfunction

    Process finished with exit code 0
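The parser cannot handle things like vector<double>& numbers and stops parsing that part of the code. Note that readfunction is even reported as a VAR_DECL rather than a FUNCTION_DECL.

I believe the problem is similar to the one described in another SO question. I tried passing the -std=c++11 parse argument explicitly, without success. An answer to that question (even though it did not solve the problem there) also suggests using -x c++, but I don't know how to add it to the code above.

Can anyone point me to a way to make libclang parse C++ statements like the ones in test.cpp?

Also, can I make it keep parsing even when it reaches a token it cannot parse?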
Solution
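By default, libclang does not add the compiler's system include paths.

Always make sure you have checked the diagnostics. Like compiler error messages, they tend to indicate how to resolve any problem. In this case there is clearly an include problem: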
    <Diagnostic severity 4, location <SourceLocation file 'test.cpp', line 3, column 10>, spelling "'iostream' file not found">
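Diagnostics like the one above are exposed by the Python bindings through the `diagnostics` attribute of the translation unit. The following is a minimal sketch of a helper that turns them into compiler-style messages; `format_diagnostics` and `SEVERITY_NAMES` are names made up for this illustration, and the stand-in objects at the bottom only exist so the snippet runs without libclang installed. With the real bindings you would pass the object returned by `index.parse(...)`.

```python
from types import SimpleNamespace

# Severity levels used by libclang (they mirror the clang.cindex.Diagnostic
# constants Ignored, Note, Warning, Error and Fatal).
SEVERITY_NAMES = {0: "ignored", 1: "note", 2: "warning", 3: "error", 4: "fatal"}


def format_diagnostics(trans_unit, min_severity=2):
    """Return one readable line per diagnostic at or above min_severity."""
    lines = []
    for diag in trans_unit.diagnostics:
        if diag.severity < min_severity:
            continue
        loc = diag.location
        # loc.file is None for diagnostics that are not tied to a file.
        filename = loc.file.name if loc.file is not None else "<no file>"
        lines.append("%s:%d:%d: %s: %s" % (
            filename, loc.line, loc.column,
            SEVERITY_NAMES.get(diag.severity, "unknown"), diag.spelling))
    return lines


# Stand-in for a parsed translation unit (assumption: illustration only).
fake_loc = SimpleNamespace(file=SimpleNamespace(name="test.cpp"), line=3, column=10)
fake_tu = SimpleNamespace(diagnostics=[
    SimpleNamespace(severity=4, location=fake_loc,
                    spelling="'iostream' file not found"),
])

for message in format_diagnostics(fake_tu):
    print(message)
```

Checking this list right after every parse makes a missing-header problem visible immediately, instead of showing up later as a truncated or wrong AST.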
Once you make sure libclang is given those paths, it should start working.
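One common way to discover those paths is to ask the compiler driver itself: running something like `clang++ -E -x c++ - -v` on empty input prints the header search list on stderr. The sketch below parses that listing. It is an illustration under assumptions, not the implementation of any particular library: `parse_include_search_paths`, `query_system_include_paths` and the `SAMPLE` text are invented for this example, and the exact driver output can differ between platforms and toolchains.

```python
import subprocess


def parse_include_search_paths(verbose_output):
    """Extract the directories listed between the
    '#include <...> search starts here:' and 'End of search list.' markers."""
    paths = []
    in_list = False
    for raw in verbose_output.splitlines():
        line = raw.strip()
        if line.startswith("#include <...> search starts here:"):
            in_list = True
        elif line.startswith("End of search list."):
            break
        elif in_list and line:
            # macOS marks framework directories; they are not plain -I paths.
            if line.endswith("(framework directory)"):
                continue
            paths.append(line)
    return paths


def query_system_include_paths(compiler="clang++"):
    """Hypothetical helper: run the compiler on empty input and parse stderr."""
    proc = subprocess.run([compiler, "-E", "-x", "c++", "-", "-v"],
                          input=b"", stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE)
    return parse_include_search_paths(proc.stderr.decode("utf-8", "replace"))


# Made-up sample of verbose driver output, used only to exercise the parser.
SAMPLE = """\
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/5
 /usr/lib/clang/3.8.0/include
 /usr/include
End of search list.
"""

print(parse_include_search_paths(SAMPLE))
```

Keeping the parsing separate from the subprocess call means the text handling can be checked without a compiler on the machine.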
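This question includes approaches to solving this problem. It seems to be a recurring theme on Stack Overflow, so I wrote ccsyspath to help find these paths on OSX, Linux and Windows. Simplifying your code slightly: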
    import clang.cindex
    clang.cindex.Config.set_library_file('C:/Program Files (x86)/LLVM/bin/libclang.dll')
    import ccsyspath

    index = clang.cindex.Index.create()

    # force C++ mode and add the compiler's system include paths
    args    = '-x c++ --std=c++11'.split()
    syspath = ccsyspath.system_include_paths('clang++')
    incargs = [ b'-I' + inc for inc in syspath ]
    args    = args + incargs

    trans_unit = index.parse('test.cpp', args=args)

    for node in trans_unit.cursor.walk_preorder():
        # skip anything that does not come from test.cpp itself
        if node.location.file is None:
            continue
        if node.location.file.name != 'test.cpp':
            continue
        if node.kind.is_declaration():
            print(node.kind, node.location)
My args end up being:
    ['-x',
     'c++',
     '--std=c++11',
     '-IC:\\Program Files (x86)\\LLVM\\bin\\..\\lib\\clang\\3.8.0\\include',
     '-IC:\\Program Files (x86)\\Microsoft Visual Studio 12.0\\VC\\include',
     '-IC:\\Program Files (x86)\\Windows Kits\\8.1\\include\\shared',
     '-IC:\\Program Files (x86)\\Windows Kits\\8.1\\include\\um',
     '-IC:\\Program Files (x86)\\Windows Kits\\8.1\\include\\winrt']
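The argument list is just the language and standard flags followed by one `-I` entry per include directory. As a rough sketch, the construction can be wrapped in a small function that also tolerates a mix of `bytes` and `str` paths; `build_parse_args` is a hypothetical name introduced here, and the example paths are placeholders rather than paths from a real installation.

```python
def build_parse_args(include_paths, std="c++11"):
    """Build a libclang argument list: force C++ mode, select the language
    standard, and add one -I entry per include directory."""
    args = ["-x", "c++", "-std=" + std]
    for path in include_paths:
        # Normalise bytes entries to str so the list has a single type.
        if isinstance(path, bytes):
            path = path.decode("utf-8")
        args.append("-I" + path)
    return args


# Placeholder paths, for illustration only.
print(build_parse_args([b"/usr/include", "/opt/include"]))
```

The resulting list can be passed as the `args` keyword of `index.parse`, in the same way as the hand-built list.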
The output is:
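    (CursorKind.USING_DIRECTIVE, <SourceLocation file 'test.cpp', line 10, column 17>)
    (CursorKind.FUNCTION_DECL, <SourceLocation file 'test.cpp', line 12, column 6>)
    (CursorKind.PARM_DECL, <SourceLocation file 'test.cpp', line 12, column 35>)
    (CursorKind.PARM_DECL, <SourceLocation file 'test.cpp', line 12, column 54>)
    (CursorKind.VAR_DECL, <SourceLocation file 'test.cpp', line 15, column 14>)
    (CursorKind.FUNCTION_DECL, <SourceLocation file 'test.cpp', line 21, column 8>)
    (CursorKind.PARM_DECL, <SourceLocation file 'test.cpp', line 21, column 37>)
    (CursorKind.VAR_DECL, <SourceLocation file 'test.cpp', line 24, column 14>)
    (CursorKind.VAR_DECL, <SourceLocation file 'test.cpp', line 25, column 40>)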