如何使用vba解析XML
我在VBA工作,并想分析一个字符串例如
<PointN xsi:type='typens:PointN' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xs='http://www.w3.org/2001/XMLSchema'> <X>24.365</X> <Y>78.63</Y> </PointN>
并将X和Y值分成两个独立的整数变量。
我是XML的新手,因为我工作在这个领域,所以我被困在VB6和VBA中。
我如何做到这一点?
这是一个复杂的问题,但似乎最直接的路线是通过MSXML2.DOMDocument加载XML文档或XML字符串,然后允许您访问XML节点。
您可以在以下站点上找到有关MSXML2.DOMDocument的更多信息:
- 使用Excel VBA和Xpath操作XML文件
- MSXML – http://msdn.microsoft.com/en-us/library/ms763742(VS.85).aspx
- MSXML 4.0概述
感谢指针。
我不知道,这是否是解决问题的最好办法,但是这是我如何运作的。 我引用了我的VBA中的Microsoft XML v2.6 dll,然后下面的代码片断给了我所需的值
Dim objXML As MSXML2.DOMDocument Set objXML = New MSXML2.DOMDocument If Not objXML.loadXML(strXML) Then 'strXML is the string with XML' Err.Raise objXML.parseError.ErrorCode, , objXML.parseError.reason End If Dim point As IXMLDOMNode Set point = objXML.firstChild Debug.Print point.selectSingleNode("X").Text Debug.Print point.selectSingleNode("Y").Text
这是使用FeedDemon opml文件的示例OPML分析器:
Sub debugPrintOPML() ' http://msdn.microsoft.com/en-us/library/ms763720(v=VS.85).aspx ' http://msdn.microsoft.com/en-us/library/system.xml.xmlnode.selectnodes.aspx ' http://msdn.microsoft.com/en-us/library/ms256086(v=VS.85).aspx ' expressions ' References: Microsoft XML Dim xmldoc As New DOMDocument60 Dim oNodeList As IXMLDOMSelection Dim oNodeList2 As IXMLDOMSelection Dim curNode As IXMLDOMNode Dim n As Long, n2 As Long, x As Long Dim strXPathQuery As String Dim attrLength As Byte Dim FilePath As String FilePath = "rss.opml" xmldoc.Load CurrentProject.Path & "\" & FilePath strXPathQuery = "opml/body/outline" Set oNodeList = xmldoc.selectNodes(strXPathQuery) For n = 0 To (oNodeList.length - 1) Set curNode = oNodeList.Item(n) attrLength = curNode.Attributes.length If attrLength > 1 Then ' or 2 or 3 Call processNode(curNode) Else Call processNode(curNode) strXPathQuery = "opml/body/outline[position() = " & n + 1 & "]/outline" Set oNodeList2 = xmldoc.selectNodes(strXPathQuery) For n2 = 0 To (oNodeList2.length - 1) Set curNode = oNodeList2.Item(n2) Call processNode(curNode) Next End If Debug.Print "----------------------" Next Set xmldoc = Nothing End Sub Sub processNode(curNode As IXMLDOMNode) Dim sAttrName As String Dim sAttrValue As String Dim attrLength As Byte Dim x As Long attrLength = curNode.Attributes.length For x = 0 To (attrLength - 1) sAttrName = curNode.Attributes.Item(x).nodeName sAttrValue = curNode.Attributes.Item(x).nodeValue Debug.Print sAttrName & " = " & sAttrValue Next Debug.Print "-----------" End Sub
这一个文件夹(Awasu,NewzCrawler)的多级树:
... Call xmldocOpen4 Call debugPrintOPML4(Null) ... Dim sText4 As String Sub debugPrintOPML4(strXPathQuery As Variant) Dim xmldoc4 As New DOMDocument60 'Dim xmldoc4 As New MSXML2.DOMDocument60 ' ? Dim oNodeList As IXMLDOMSelection Dim curNode As IXMLDOMNode Dim n4 As Long If IsNull(strXPathQuery) Then strXPathQuery = "opml/body/outline" ' http://msdn.microsoft.com/en-us/library/ms754585(v=VS.85).aspx xmldoc4.async = False xmldoc4.loadXML sText4 If (xmldoc4.parseError.errorCode <> 0) Then Dim myErr Set myErr = xmldoc4.parseError MsgBox ("You have error " & myErr.reason) Else ' MsgBox xmldoc4.xml End If Set oNodeList = xmldoc4.selectNodes(strXPathQuery) For n4 = 0 To (oNodeList.length - 1) Set curNode = oNodeList.Item(n4) Call processNode4(strXPathQuery, curNode, n4) Next Set xmldoc4 = Nothing End Sub Sub processNode4(strXPathQuery As Variant, curNode As IXMLDOMNode, n4 As Long) Dim sAttrName As String Dim sAttrValue As String Dim x As Long For x = 0 To (curNode.Attributes.length - 1) sAttrName = curNode.Attributes.Item(x).nodeName sAttrValue = curNode.Attributes.Item(x).nodeValue 'If sAttrName = "text" Debug.Print strXPathQuery & " :: " & sAttrName & " = " & sAttrValue 'End If Next Debug.Print "" If curNode.childNodes.length > 0 Then Call debugPrintOPML4(strXPathQuery & "[position() = " & n4 + 1 & "]/" & curNode.nodeName) End If End Sub Sub xmldocOpen4() Dim oFSO As New FileSystemObject ' Microsoft Scripting Runtime Reference Dim oFS Dim FilePath As String FilePath = "rss_awasu.opml" Set oFS = oFSO.OpenTextFile(CurrentProject.Path & "\" & FilePath) sText4 = oFS.ReadAll oFS.Close End Sub
或更好:
Sub xmldocOpen4() Dim FilePath As String FilePath = "rss.opml" ' function ConvertUTF8File(sUTF8File): ' http://www.vbmonster.com/Uwe/Forum.aspx/vb/24947/How-to-read-UTF-8-chars-using-VBA ' loading and conversion from Utf-8 to UTF sText8 = ConvertUTF8File(CurrentProject.Path & "\" & FilePath) End Sub
但我不明白,为什么应该每次加载xmldoc4。
添加引用项目 – >参考Microsoft XML,6.0,您可以使用示例代码:
Dim xml As String xml = "<root><person><name>Me </name> </person> <person> <name>No Name </name></person></root> " Dim oXml As MSXML2.DOMDocument60 Set oXml = New MSXML2.DOMDocument60 oXml.loadXML xml Dim oSeqNodes, oSeqNode As IXMLDOMNode Set oSeqNodes = oXml.selectNodes("//root/person") If oSeqNodes.length = 0 Then 'show some message Else For Each oSeqNode In oSeqNodes Debug.Print oSeqNode.selectSingleNode("name").Text Next End If
注意xml节点// Root / Person与// root / person不一样,还有selectSingleNode(“Name”)。text与selectSingleNode(“name”)不一样。
您可以使用XPath查询:
Dim objDom As Object '// DOMDocument Dim xmlStr As String, _ xPath As String xmlStr = _ "<PointN xsi:type='typens:PointN' " & _ "xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' " & _ "xmlns:xs='http://www.w3.org/2001/XMLSchema'> " & _ " <X>24.365</X> " & _ " <Y>78.63</Y> " & _ "</PointN>" Set objDom = CreateObject("Msxml2.DOMDocument.3.0") '// Using MSXML 3.0 '/* Load XML */ objDom.LoadXML xmlStr '/* ' * XPath Query ' */ '/* Get X */ xPath = "/PointN/X" Debug.Print objDom.SelectSingleNode(xPath).text '/* Get Y */ xPath = "/PointN/Y" Debug.Print objDom.SelectSingleNode(xPath).text
下面是一个简短的子部分,用于解析包含钢结构形状数据的MicroStation Triforma XML文件。
'location of triforma structural files 'c:\programdata\bentley\workspace\triforma\tf_imperial\data\us.xml Sub ReadTriformaImperialData() Dim txtFileName As String Dim txtFileLine As String Dim txtFileNumber As Long Dim Shape As String Shape = "w12x40" txtFileNumber = FreeFile txtFileName = "c:\programdata\bentley\workspace\triforma\tf_imperial\data\us.xml" Open txtFileName For Input As #txtFileNumber Do While Not EOF(txtFileNumber) Line Input #txtFileNumber, txtFileLine If InStr(1, UCase(txtFileLine), UCase(Shape)) Then P1 = InStr(1, UCase(txtFileLine), "D=") D = Val(Mid(txtFileLine, P1 + 3)) P2 = InStr(1, UCase(txtFileLine), "TW=") TW = Val(Mid(txtFileLine, P2 + 4)) P3 = InStr(1, UCase(txtFileLine), "WIDTH=") W = Val(Mid(txtFileLine, P3 + 7)) P4 = InStr(1, UCase(txtFileLine), "TF=") TF = Val(Mid(txtFileLine, P4 + 4)) Close txtFileNumber Exit Do End If Loop End Sub
从这里,您可以使用这些值在MicroStation 2d中绘制形状,或者以3d形式进行绘制,并将其拉伸成实体。
更新
下面介绍的过程给出了使用XML DOM对象使用VBA解析XML的示例。 代码基于XML DOM的初学者指南 。
Public Sub LoadDocument() Dim xDoc As MSXML.DOMDocument Set xDoc = New MSXML.DOMDocument xDoc.validateOnParse = False If xDoc.Load("C:\My Documents\sample.xml") Then ' The document loaded successfully. ' Now do something intersting. DisplayNode xDoc.childNodes, 0 Else ' The document failed to load. ' See the previous listing for error information. End If End Sub Public Sub DisplayNode(ByRef Nodes As MSXML.IXMLDOMNodeList, _ ByVal Indent As Integer) Dim xNode As MSXML.IXMLDOMNode Indent = Indent + 2 For Each xNode In Nodes If xNode.nodeType = NODE_TEXT Then Debug.Print Space$(Indent) & xNode.parentNode.nodeName & _ ":" & xNode.nodeValue End If If xNode.hasChildNodes Then DisplayNode xNode.childNodes, Indent End If Next xNode End Sub
Nota Bene – 这个最初的答案显示了我能想象到的最简单的事情(当时我正在研究一个非常具体的问题)。 当然,使用VBA XML Dom内置的XML功能会更好。 看到上面的更新。
原始回应
我知道这是一个很旧的帖子,但我想分享我简单的解决方案,这个复杂的问题。 主要我已经使用基本的字符串函数来访问XML数据。
这假定你有一些在VBA函数中返回的xml数据(在temp变量中)。 有趣的是,还可以看到我如何链接到一个XML Web服务来检索值。 图像中显示的函数也采用查找值,因为可以使用= FunctionName(value1,value2)从单元格内访问此Excel VBA函数,以通过Web服务将值返回到电子表格中。
openTag = "<" & tagValue & ">" closeTag = "< /" & tagValue & ">"
' Locate the position of the enclosing tags startPos = InStr(1, temp, openTag) endPos = InStr(1, temp, closeTag) startTagPos = InStr(startPos, temp, ">") + 1 ' Parse xml for returned value Data = Mid(temp, startTagPos, endPos - startTagPos)
当你不想启用宏时,通常不用VBA就可以解析。 这可以通过替换功能来完成。 输入您的开始和结束节点到单元格B1和C1。
Cell A1: {your XML here} Cell B1: <X> Cell C1: </X> Cell D1: =REPLACE(A1,1,FIND(A2,A1)+LEN(A2)-1,"") Cell E1: =REPLACE(A4,FIND(A3,A4),LEN(A4)-FIND(A3,A4)+1,"")
结果行E1将得到你的解析值:
Cell A1: {your XML here} Cell B1: <X> Cell C1: </X> Cell D1: 24.365<X><Y>78.68</Y></PointN> Cell E1: 24.365
XML解析代码
Option Explicit Dim Path As String ' input path name Dim FileName As String ' input file name Dim intColumnCount As Integer ' column counter Dim intLoop As Integer ' Looping integer Dim objDictionary As Scripting.Dictionary ' dictionary object to store column identification for id, method, query string etc Dim intPrevRequest_id As Integer 'stores previous request id Dim intCurrRequest_id As Integer 'stores current request id Dim strWholeReq As String ' Full request that is ready to be written to file Dim strStartQuotes As String ' Placeholder which holds starting double quotes Dim strEndQuotes As String ' Placeholder which holds ending double quotes Dim strStepName As String ' First line of the Parsed_XML_Function. eg Parsed_XML_Function("Step5", 'Here 5 comes from intStepNum variable Dim strUrl As String ' contains URL and Query string Dim strQueryStr As String ' Query string Dim strMethod As String ' Method part of request Dim strBody As String 'Body attributes Dim strMisc As String ' Misc items such as Resource, Snapshot number etc Dim strContentType As String ' Content type of request Dim intStepNum As Integer ' iterative count to identify step Dim objFileSys As Scripting.FileSystemObject ' file system object Dim objFile As Scripting.File 'file object Dim objTextStr As Scripting.TextStream 'text stream object Dim ActionFileName As String ' destination action name 'this funciton is the main function which calls other functions Sub Main() Path = Worksheets(1).Cells(1, 2).Value FileName = Worksheets(1).Cells(2, 2).Value ActionFileName = Worksheets(1).Cells(3, 2).Value 'open xml file Workbooks.Open FileName:=Path & "\" & FileName 'activate the workbook Windows(FileName).Activate 'delete first row Rows("1:1").Select Selection.Delete Shift:=xlUp Range(Selection, Selection.End(xlToRight)).Select ActiveSheet.Name = "PARSINGVS_XML" 'get total columns and analyze the columns intColumnCount = Worksheets("PARSINGVS_XML").UsedRange.Columns.Count Set objDictionary = New Dictionary intLoop = 1 For intLoop = 1 To intColumnCount If InStr(1, Worksheets("PARSINGVS_XML").Cells(1, intLoop), "Request/#id", 1) > 0 Then objDictionary.Add "Req_id", intLoop ElseIf InStr(1, Worksheets("PARSINGVS_XML").Cells(1, intLoop), "Request/@Method", 1) > 0 Then objDictionary.Add "Req_method", intLoop ElseIf InStr(1, Worksheets("PARSINGVS_XML").Cells(1, intLoop), "Request/@Url", 1) > 0 Then objDictionary.Add "Req_url", intLoop ElseIf InStr(1, Worksheets("PARSINGVS_XML").Cells(1, intLoop), "FormPostHttpBody/@ContentType", 1) > 0 Then objDictionary.Add "Req_contenttype", intLoop ElseIf InStr(1, Worksheets("PARSINGVS_XML").Cells(1, intLoop), "FormPostParameter/@Name", 1) > 0 Then objDictionary.Add "Req_itemdata_name", intLoop ElseIf InStr(1, Worksheets("PARSINGVS_XML").Cells(1, intLoop), "FormPostParameter/@Value", 1) > 0 Then objDictionary.Add "Req_itemdata_value", intLoop ElseIf InStr(1, Worksheets("PARSINGVS_XML").Cells(1, intLoop), "QueryStringParameter/@Name", 1) > 0 Then objDictionary.Add "Req_querystring_name", intLoop ElseIf InStr(1, Worksheets("PARSINGVS_XML").Cells(1, intLoop), "QueryStringParameter/@Value", 1) > 0 Then objDictionary.Add "Req_querystring_value", intLoop End If Next 'Loop through all requests and capture querysting, itemdata, url, method, action and content type '----------------------------------------------- 'Initialize variables ot default value at start '----------------------------------------------- intPrevRequest_id = 1 intCurrRequest_id = 1 strStartQuotes = """" strEndQuotes = """," & vbCrLf intStepNum = 1 strQueryStr = "" strBody = "" Set objFileSys = New Scripting.FileSystemObject objFileSys.CreateTextFile (Path & "\" & ActionFileName) Set objFile = objFileSys.GetFile(Path & "\" & ActionFileName) Set objTextStr = objFile.OpenAsTextStream(ForAppending, TristateUseDefault) intLoop = 2 'first line is the header For intLoop = 2 To Worksheets("PARSINGVS_XML").UsedRange.Rows.Count If objDictionary.Exists("Req_id") Then intCurrRequest_id = Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_id")).Value) Else MsgBox "XML do nto contain Request id column" Exit Sub End If 'if current and previous request id are not same OR we are at end of steps the write to file If (intPrevRequest_id <> intCurrRequest_id) Or (intLoop = Worksheets("PARSINGVS_XML").UsedRange.Rows.Count) Then Call WriteToFile 'iterate to next step intStepNum = intStepNum + 1 strQueryStr = "" strBody = "" intPrevRequest_id = intCurrRequest_id End If Call Write_Remaining_DESTINATIONVS_Req ' build the DESTINATIONVS request apart from Body & Query string Call WriteQuery_Body 'build hte body and querystring Next MsgBox "Completed" Set objDictionary = Nothing objTextStr.Close Set objTextStr = Nothing Set objFile = Nothing Set objFileSys = Nothing Windows(FileName).Close (False) End Sub 'funciton to write contents to file Sub WriteToFile() strWholeReq = strWholeReq & vbCrLf & strStepName & strUrl If strQueryStr <> "" Then strWholeReq = strWholeReq & "?" & strQueryStr End If strWholeReq = strWholeReq & strEndQuotes & strMethod & strContentType & strMisc If strBody <> "" Then strWholeReq = strWholeReq & strStartQuotes & "Body=" & strBody & strEndQuotes End If strWholeReq = strWholeReq & " LAST);" & vbCrLf objTextStr.WriteLine strWholeReq strWholeReq = "" End Sub 'function to build the querystring and body part which are iterative Sub WriteQuery_Body() If objDictionary.Exists("Req_querystring_name") Then If Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_querystring_name")).Value) <> "" Then If strQueryStr <> "" Then strQueryStr = strQueryStr & "&" End If 'Querystring strQueryStr = strQueryStr & Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_querystring_name")).Value) & "=" & _ Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_querystring_value")).Value) End If End If If objDictionary.Exists("Req_itemdata_name") Then If Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_itemdata_name")).Value) <> "" Then If strBody <> "" Then strBody = strBody & "&" End If 'Body strBody = strBody & Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_itemdata_name")).Value) & "=" & _ Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_itemdata_value")).Value) End If End If End Sub 'function which creates remaining part of web_custom request other than querystring and body Sub Write_Remaining_DESTINATIONVS_Req() 'Name of Parsed_XML_Function("Step2", strStepName = "Parsed_XML_Function(" & strStartQuotes & "Step" & intStepNum & strEndQuotes If objDictionary.Exists("Req_url") Then '"URL = " strUrl = strStartQuotes & _ "URL=" & Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_url")).Value) End If If objDictionary.Exists("Req_method") Then 'Method = strMethod = strStartQuotes & _ "Method=" & Trim(Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_method")).Value)) & strEndQuotes End If If objDictionary.Exists("Req_contenttype") Then 'ContentType = If Trim(Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_contenttype")).Value)) <> "" Then strContentType = strStartQuotes & _ "RecContentType=" & Trim(Trim(Worksheets("PARSINGVS_XML").Cells(intLoop, objDictionary("Req_contenttype")).Value)) & strEndQuotes Else strContentType = strStartQuotes & "RecContentType=text/html" & strEndQuotes End If Else strContentType = strStartQuotes & "RecContentType=text/html" & strEndQuotes End If 'remaining all strMisc = strStartQuotes & "TargetFrame=" & strEndQuotes & _ strStartQuotes & "Resource=0" & strEndQuotes & _ strStartQuotes & "Referer=" & strEndQuotes & _ strStartQuotes & "Mode=HTML" & strEndQuotes & _ strStartQuotes & "Snapshot=t" & intStepNum & ".inf" & strEndQuotes End Sub