Load CSV into a 2D matrix with numpy for plotting

Given this CSV file:

    "A","B","C","D","E","F","timestamp"
    611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
    611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
    611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12

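I simply want to load it as a matrix/ndarray with 3 rows and 7 columns. However, for some reason, all I can get out of numpy is an ndarray with 3 rows (one per line) and no columns.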

    r = np.genfromtxt(fname, delimiter=',', dtype=None, names=True)
    print(r)
    print(r.shape)

    [ (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291111964948.0)
     (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291113113366.0)
     (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291120650486.0)]
    (3,)

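I could manually iterate and convert it into the shape I want, but that seems silly. I just want to load it as a proper matrix so I can slice it along different dimensions and plot it, just like in MATLAB.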

Pure numpy:

    numpy.loadtxt(open("test.csv", "rb"), delimiter=",", skiprows=1)

See the loadtxt documentation.
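To show what the loadtxt call gives you, here is a minimal sketch. It assumes the CSV from the question is held in an in-memory string (the names `csv_text`, `data`, `first_column` and `timestamps` are only for illustration) so that it runs without a `test.csv` on disk.

```python
import io
import numpy as np

# Inline copy of the CSV from the question, so the sketch runs without a file.
csv_text = '''"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
'''

# skiprows=1 drops the quoted header line; the remaining rows parse as floats.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

print(data.shape)            # (3, 7)

# Ordinary 2-D slicing now works, as it would in MATLAB.
first_column = data[:, 0]    # column "A"
timestamps = data[:, -1]     # column "timestamp"
```

The column names are lost this way, so you index columns by position rather than by header.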

You can also use Python's csv module:

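    import csv
    import numpy

    # Text mode with newline="" is what the csv module expects on Python 3
    with open("test.csv", newline="") as f:
        reader = csv.reader(f, delimiter=",")
        next(reader)  # skip the header row, which cannot be cast to float
        x = list(reader)
    result = numpy.array(x).astype("float")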

You will have to convert it to your preferred numeric type. I suppose you could write the whole thing on one line:

    result = numpy.array(list(csv.reader(open("test.csv", newline=""), delimiter=","))[1:]).astype("float")

Added hint:

You can also use pandas.io.parsers.read_csv (exposed as pandas.read_csv) and get the associated numpy array, which can be faster.
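A minimal sketch of the pandas route, assuming pandas is installed and using `DataFrame.to_numpy()` to get the array. The in-memory `csv_text` and the names `frame` and `data` are illustrative, not part of the original answer.

```python
import io
import pandas as pd

# Inline copy of the CSV from the question, so the sketch runs without a file.
csv_text = '''"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
'''

# read_csv uses the first line as column names by default.
frame = pd.read_csv(io.StringIO(csv_text))

# All columns are numeric, so this is a plain 2-D float array.
data = frame.to_numpy()

print(data.shape)            # (3, 7)
print(list(frame.columns))
```

The DataFrame keeps the header names, so you can select by name in pandas and only drop down to numpy when you need the raw matrix.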

I think using dtype=None together with a names row is what confuses the routine: names=True makes genfromtxt return a 1-D structured array. Try skipping the header instead:

    >>> r = np.genfromtxt(fname, delimiter=',', skip_header=1)
    >>> r
    array([[  6.11882430e+02,   9.08956010e+03,   5.13300000e+03,
              8.64075140e+02,   1.71537476e+03,   7.65227770e+02,
              1.29111196e+12],
           [  6.11882430e+02,   9.08956010e+03,   5.13300000e+03,
              8.64075140e+02,   1.71537476e+03,   7.65227770e+02,
              1.29111311e+12],
           [  6.11882430e+02,   9.08956010e+03,   5.13300000e+03,
              8.64075140e+02,   1.71537476e+03,   7.65227770e+02,
              1.29112065e+12]])
    >>> r[:,0]    # Slice 0'th column
    array([ 611.88243,  611.88243,  611.88243])
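To make the difference concrete, the sketch below reads the same data twice with genfromtxt, once with names=True and once with skip_header=1, and compares the shapes. The in-memory `csv_text` and the names `structured` and `plain` are assumptions made for illustration.

```python
import io
import numpy as np

# Inline copy of the CSV from the question, so the sketch runs without a file.
csv_text = '''"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
'''

# names=True: the header becomes field names and each row becomes one record.
structured = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# skip_header=1: the header is discarded and the rows form a 2-D float array.
plain = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(structured.shape)   # (3,)
print(plain.shape)        # (3, 7)
```

The (3,) shape in the question is therefore expected behaviour for a structured array: the seven "columns" live in the dtype as fields rather than in a second dimension.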

You can use np.recfromcsv to read a CSV file with a header into a NumPy record array. For example:

    import io
    import numpy as np

    csv_text = """\
    "A","B","C","D","E","F","timestamp"
    611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
    611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
    611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
    """

    # Make a file-like object
    csv_file = io.StringIO(csv_text)
    csv_file.seek(0)

    # Read the CSV file into a Numpy record array
    r = np.recfromcsv(csv_file, case_sensitive=True)
    print(repr(r))

The result looks like this:

    rec.array([ (  611.88243,  9089.5601,  5133.,  864.07514,  1715.37476,  765.22777,   1.29111196e+12),
     (  611.88243,  9089.5601,  5133.,  864.07514,  1715.37476,  765.22777,   1.29111311e+12),
     (  611.88243,  9089.5601,  5133.,  864.07514,  1715.37476,  765.22777,   1.29112065e+12)],
              dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8'), ('D', '<f8'), ('E', '<f8'), ('F', '<f8'), ('timestamp', '<f8')])

You can access a named column like this: r['E']

    array([ 1715.37476,  1715.37476,  1715.37476])
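If you start from a record or structured array like this but still want the plain 3x7 matrix the question asks for, one option is `structured_to_unstructured` from `numpy.lib.recfunctions`. The sketch below is an illustration under assumptions: it builds the structured array with `np.genfromtxt(..., names=True)` rather than `np.recfromcsv`, so that it does not depend on that helper being available in the installed NumPy, and the names `matrix` and `column_e` are made up for the example.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Inline copy of the CSV from the question, so the sketch runs without a file.
csv_text = '''"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
'''

# Structured array with one float field per header column (shape (3,)).
r = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# All fields are floats, so they can be stacked into an ordinary 2-D array.
matrix = rfn.structured_to_unstructured(r)

print(matrix.shape)       # (3, 7)

# Positional slicing: index 4 is the fifth column, "E" in the header.
column_e = matrix[:, 4]
```

This keeps the named access for as long as it is convenient and converts to a plain matrix only when you need to slice across columns or pass the data to plotting code.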