Extracting an attribute value with BeautifulSoup

I am trying to extract the content of a single "value" attribute in a specific "input" tag on a web page. I use the following code:

    import urllib
    f = urllib.urlopen("http://58.68.130.147")
    s = f.read()
    f.close()

    from BeautifulSoup import BeautifulStoneSoup
    soup = BeautifulStoneSoup(s)

    inputTag = soup.findAll(attrs={"name" : "stainfo"})
    output = inputTag['value']
    print str(output)

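I get a TypeError: list indices must be integers, not str

From the BeautifulSoup documentation I understood that strings should not be a problem here, but I am no expert and I may have misunderstood.

Any suggestion is greatly appreciated! Thanks in advance.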

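.findAll() returns a list of all the elements it finds, so after:

    inputTag = soup.findAll(attrs={"name" : "stainfo"})

inputTag is a list (probably containing only one element). Depending on exactly what you want, you should either do:

    output = inputTag[0]['value']

or use the .find() method, which returns only one (the first) found element:

    inputTag = soup.find(attrs={"name": "stainfo"})
    output = inputTag['value']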
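To make the difference between the two calls concrete, here is a minimal sketch using the newer bs4 package, where the methods are spelled find_all() and find(). The HTML string is a made-up stand-in for the page in the question (the real page's markup is not shown in the thread), so the tag layout and the value "abc123" are assumptions.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page in the question.
html = """
<form>
  <input type="hidden" name="stainfo" value="abc123">
  <input type="text" name="other" value="ignored">
</form>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() always returns a list-like ResultSet, even for a single match,
# so it has to be indexed with an integer before reading an attribute.
matches = soup.find_all("input", attrs={"name": "stainfo"})
first_value = matches[0]["value"]

# find() returns the first matching Tag directly, or None if nothing matches.
tag = soup.find("input", attrs={"name": "stainfo"})
value = tag["value"] if tag is not None else None

# A search that matches nothing: find() gives None rather than raising.
missing = soup.find("input", attrs={"name": "no-such-name"})

print(first_value, value, missing)
```

Checking for None before indexing is worth the extra line, because indexing the result of a failed find() raises a TypeError of its own.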

If you want to retrieve multiple attribute values from the source above, you can use findAll and a list comprehension to get everything you need:

    import urllib
    f = urllib.urlopen("http://58.68.130.147")
    s = f.read()
    f.close()

    from BeautifulSoup import BeautifulStoneSoup
    soup = BeautifulStoneSoup(s)

    # You may be able to do findAll("input", attrs={"name" : "stainfo"})
    inputTags = soup.findAll(attrs={"name" : "stainfo"})

    # Index each tag by the attribute you want ("value"),
    # not by the name that was used to search for it.
    output = [x["value"] for x in inputTags]

    print output  # This will print a list of the values.
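The list-comprehension approach can fail with a KeyError if one of the matched tags lacks the attribute. The sketch below, written against bs4 and Python 3, shows two ways to cope with that; the three input tags are invented for illustration and are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second tag deliberately has no value attribute.
html = """
<input name="stainfo" value="first">
<input name="stainfo">
<input name="stainfo" value="second">
"""

soup = BeautifulSoup(html, "html.parser")
input_tags = soup.find_all("input", attrs={"name": "stainfo"})

# tag["value"] raises KeyError when the attribute is absent,
# so filter with has_attr() to keep only tags that carry it.
values = [tag["value"] for tag in input_tags if tag.has_attr("value")]

# tag.get() returns None for a missing attribute instead of raising,
# which keeps one entry per matched tag.
values_or_none = [tag.get("value") for tag in input_tags]

print(values)
print(values_or_none)
```

Which form fits better depends on whether a missing attribute should be skipped or reported.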

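Actually, if you know which kind of tag carries these attributes, I would suggest a time-saving approach.

Suppose the tag xyz has an attribute named "staininfo".

    full_tag = soup.findAll("xyz")

Note that full_tag is a list, so iterate over it:

    for each_tag in full_tag:
        staininfo_attrb_value = each_tag["staininfo"]
        print staininfo_attrb_value

This gives you the staininfo attribute value of every xyz tag.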
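If some xyz tags do not carry the attribute, the loop above raises a KeyError on the first one that lacks it. One way around that, sketched here with bs4 on invented markup, is to pass True as the attribute value so that only tags which actually have the attribute are matched. The tag and attribute names follow the hypothetical xyz/staininfo example from this answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the middle xyz tag has no staininfo attribute.
markup = """
<root>
  <xyz staininfo="a">one</xyz>
  <xyz>two</xyz>
  <xyz staininfo="b">three</xyz>
</root>
"""

soup = BeautifulSoup(markup, "html.parser")

# True as the attribute value matches any xyz tag that has the attribute,
# whatever its value, and skips tags without it.
tagged = soup.find_all("xyz", attrs={"staininfo": True})
staininfo_values = [tag["staininfo"] for tag in tagged]

print(staininfo_values)
```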

In Python 3.x, simply call get(attr_name) on the tag objects returned by find_all:

    from bs4 import BeautifulSoup

    # Read the whole XML file into a string.
    with open('conf//test1.xml', 'r') as xmlFile:
        xmlData = xmlFile.read()

    xmlSoup = BeautifulSoup(xmlData, 'html.parser')
    # html.parser lowercases tag names, hence the lowercase search term.
    repElemList = xmlSoup.find_all('repeatingelement')

    for repElem in repElemList:
        print("Processing repElem...")
        repElemID = repElem.get('id')
        repElemName = repElem.get('name')
        print("Attribute id = %s" % repElemID)
        print("Attribute name = %s" % repElemName)

The XML file conf//test1.xml looks like this:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <root>
        <singleElement>
            <subElementX>XYZ</subElementX>
        </singleElement>
        <repeatingElement id="11" name="Joe"/>
        <repeatingElement id="12" name="Mary"/>
    </root>

It prints:

    Processing repElem...
    Attribute id = 11
    Attribute name = Joe
    Processing repElem...
    Attribute id = 12
    Attribute name = Mary
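One detail in this answer is easy to miss: the file uses the tag name repeatingElement, but the search uses repeatingelement. That is because the HTML parsers convert tag names to lowercase. The sketch below inlines a shortened copy of the XML instead of reading conf//test1.xml, so it runs without the file. Newer bs4 releases may emit a warning that an XML document is being parsed as HTML. Passing "xml" as the parser should preserve case, but that needs the lxml package, so it is not shown here.

```python
from bs4 import BeautifulSoup

# Shortened inline copy of the XML from the answer above.
xml_data = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
  <repeatingElement id="11" name="Joe"/>
  <repeatingElement id="12" name="Mary"/>
</root>"""

soup = BeautifulSoup(xml_data, "html.parser")

# html.parser stores tag names in lowercase, so the camelCase
# spelling from the file matches nothing.
camel_case_matches = soup.find_all("repeatingElement")
lower_case_matches = soup.find_all("repeatingelement")

# Attribute values are left unchanged and can be read with get().
pairs = [(tag.get("id"), tag.get("name")) for tag in lower_case_matches]

print(len(camel_case_matches), len(lower_case_matches))
print(pairs)
```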

You can also use this:

    import requests
    from bs4 import BeautifulSoup

    url = "http://58.68.130.147/"
    r = requests.get(url)
    data = r.text

    soup = BeautifulSoup(data, "html.parser")
    get_details = soup.find_all("input", attrs={"name": "stainfo"})

    # Print the value attribute of every matching input tag.
    for val in get_details:
        get_val = val["value"]
        print(get_val)
