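Reading a plain text file in Java

It seems there are different ways to read and write data of files in Java.

I want to read ASCII data from a file. What are the possible ways and their differences?

An ASCII file is a text file, so you would use a Reader to read it. Java also supports reading binary files using InputStreams. If the file being read is huge, you will want to wrap the FileReader in a BufferedReader to improve read performance.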
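As a rough illustration of the Reader approach described above, here is a minimal sketch that reads a file line by line through a buffered reader with an explicit charset. The class name `AsciiLineCounter` and the temporary demo file are made up for this example; `Files.newBufferedReader` is used instead of `FileReader` only because it lets the charset be stated explicitly.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of any library.
class AsciiLineCounter {

    // Counts the lines of a text file, decoding its bytes as US-ASCII.
    static long countLines(Path file) throws IOException {
        try (BufferedReader reader =
                     Files.newBufferedReader(file, StandardCharsets.US_ASCII)) {
            long count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo data written to a temporary file (an assumption of this sketch).
        Path file = Files.createTempFile("ascii-demo", ".txt");
        Files.write(file, "first\nsecond\nthird\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(countLines(file)); // prints 3
        Files.delete(file);
    }
}
```

The try-with-resources statement closes the reader even if reading fails, which matters more than the buffering for small files.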

Go through this article on how to use a Reader.

I'd also recommend you download and read this wonderful (yet free) book called Thinking In Java.

In Java 7:

    new String(Files.readAllBytes(...)) or Files.readAllLines(...)

In Java 8:

    Files.lines(..).forEach(...)
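To make the abbreviated calls above concrete, here is a small sketch of the three variants. The class name `NioOneLiners`, the method names, and the choice of UTF-8 are assumptions for illustration; the `Files` methods themselves are standard JDK API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical wrapper class for the Java 7 and Java 8 one-liners.
class NioOneLiners {

    // Java 7: the whole file as a single String.
    static String readWhole(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Java 7: all lines at once, held in memory as a List.
    static List<String> readLines(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    // Java 8: a lazy stream of lines; it holds the file open, so close it.
    static long countNonEmpty(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nio-demo", ".txt");
        Files.write(file, Arrays.asList("alpha", "", "beta"), StandardCharsets.UTF_8);
        System.out.println(readLines(file).size()); // prints 3
        System.out.println(countNonEmpty(file));    // prints 2
        Files.delete(file);
    }
}
```

The first two methods load everything into memory, so they suit small files; `Files.lines` reads lazily and is the better fit when the file may be large.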

My favorite way to read a small file is to use a BufferedReader and a StringBuilder. It is very simple and to the point (not particularly efficient, but good enough for most cases):

    BufferedReader br = new BufferedReader(new FileReader("file.txt"));
    try {
        StringBuilder sb = new StringBuilder();
        String line = br.readLine();

        while (line != null) {
            sb.append(line);
            sb.append(System.lineSeparator());
            line = br.readLine();
        }
        String everything = sb.toString();
    } finally {
        br.close();
    }

Some have pointed out that from Java 7 onward you should use the try-with-resources (i.e. auto-close) feature:

    try (BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
        StringBuilder sb = new StringBuilder();
        String line = br.readLine();

        while (line != null) {
            sb.append(line);
            sb.append(System.lineSeparator());
            line = br.readLine();
        }
        String everything = sb.toString();
    }

When I read strings like this, I usually want to do some string handling per line anyway, so I go for this implementation.

Though if I just want to read a file into a String, I always use Apache Commons IO with the IOUtils.toString() method. You can have a look at the source here:

http://www.docjar.com/html/api/org/apache/commons/io/IOUtils.java.html

    FileInputStream inputStream = new FileInputStream("foo.txt");
    try {
        String everything = IOUtils.toString(inputStream);
    } finally {
        inputStream.close();
    }

And even simpler with Java 7:

    try (FileInputStream inputStream = new FileInputStream("foo.txt")) {
        String everything = IOUtils.toString(inputStream);
        // do something with everything string
    }

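The easiest way is to use the Scanner class in Java together with a FileReader object. A simple example:

    Scanner in = new Scanner(new FileReader("filename.txt"));

Scanner has several methods for reading in strings, numbers, etc. You can look for more information on this in the Java documentation page.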
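As an illustration of those typed read methods, the following sketch sums every integer token in some text and skips everything else. The class `ScannerTokens` and its `sumInts` method are invented for this example, and a `StringReader` stands in for the file so the snippet is self-contained; with a real file you would pass a `FileReader` as in the line above.

```java
import java.io.StringReader;
import java.util.Scanner;

// Hypothetical demo class showing Scanner's typed token methods.
class ScannerTokens {

    // Adds up all whitespace-delimited tokens that parse as an int.
    static int sumInts(Readable source) {
        int sum = 0;
        try (Scanner in = new Scanner(source)) {
            while (in.hasNext()) {
                if (in.hasNextInt()) {
                    sum += in.nextInt(); // consume the number
                } else {
                    in.next();           // consume and ignore a non-numeric token
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumInts(new StringReader("apples 3 pears 4 plums 10"))); // prints 17
    }
}
```

Checking `hasNextInt()` before `nextInt()` avoids an `InputMismatchException` when the next token is not a number.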

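For example, reading the whole content into a String:

    StringBuilder sb = new StringBuilder();
    // next() would return whitespace-delimited tokens and drop the whitespace,
    // so read line by line to keep the text intact
    while (in.hasNextLine()) {
        sb.append(in.nextLine()).append(System.lineSeparator());
    }
    in.close();
    String outString = sb.toString();

Also, if you need a specific encoding, you can use this instead of FileReader:

    new InputStreamReader(new FileInputStream(fileUtf8), StandardCharsets.UTF_8)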

Here's another way to do it without using external libraries:

    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;

    public String readFile(String filename) {
        String content = null;
        File file = new File(filename); // e.g. foo.txt
        // try-with-resources closes the reader exactly once, even on failure
        try (FileReader reader = new FileReader(file)) {
            char[] chars = new char[(int) file.length()];
            int length = 0;
            int read;
            // read() may return fewer chars than requested, so loop until EOF
            while (length < chars.length
                    && (read = reader.read(chars, length, chars.length - length)) != -1) {
                length += read;
            }
            content = new String(chars, 0, length);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return content;
    }

Here is a simple solution:

    String content;
    content = new String(Files.readAllBytes(Paths.get("sample.txt")));

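The methods within org.apache.commons.io.FileUtils may also be very handy, e.g.:

    /**
     * Reads the contents of a file line by line to a List
     * of Strings using the default encoding for the VM.
     */
    static List<String> readLines(File file)

What do you want to do with the text? Is the file small enough to fit into memory? I would try to find the simplest way to handle the file for your needs. The FileUtils library is very handy for this.

    // readLines takes a File, not a file name
    for (String line : FileUtils.readLines(new File("my-text-file")))
        System.out.println(line);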

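I had to benchmark the different ways. I shall comment on my findings but, in short, the fastest way is to use a plain old BufferedInputStream over a FileInputStream. If many files must be read, then three threads will reduce the total execution time to roughly half, but adding more threads will progressively degrade performance until, with twenty threads, it takes longer to complete than with just one thread.

The assumption is that you must read a file and do something meaningful with its contents. In the examples here, that means reading lines from a log and counting the ones which contain values that exceed a certain threshold. So I am assuming that the Java 8 one-liner Files.lines(Paths.get("/path/to/file.txt")).map(line -> line.split(";")) is not an option.

I tested on Java 1.8, Windows 7, and both SSD and HDD drives.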

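I wrote six different implementations:

rawParse: Use a BufferedInputStream over a FileInputStream and then cut lines by reading byte by byte. This outperformed any other single-threaded approach, but it may be very inconvenient for non-ASCII files.

lineReaderParse: Use a BufferedReader over a FileReader, read line by line, and split lines by calling String.split(). This is approximately 20% slower than rawParse.

lineReaderParseParallel: This is the same as lineReaderParse, but it uses several threads. This is the fastest option overall in all cases.

nioFilesParse: Use java.nio.file.Files.lines().

nioAsyncParse: Use an AsynchronousFileChannel with a completion handler and a thread pool.

nioMemoryMappedParse: Use a memory-mapped file. This is really a bad idea, yielding execution times at least three times longer than any other implementation.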
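The memory-mapped variant is the one whose source is not listed in this answer. For readers unfamiliar with the technique, here is a minimal sketch of what reading a file through a memory mapping generally looks like. This is not the benchmark's nioMemoryMappedParse code; the class `MappedLineCounter` and the line-counting task are assumptions chosen to keep the example short.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not taken from the benchmark.
class MappedLineCounter {

    // Counts '\n' bytes by scanning a read-only mapping of the whole file.
    // A single mapping is limited to Integer.MAX_VALUE bytes.
    static int countLines(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            int lines = 0;
            while (buffer.hasRemaining()) {
                if (buffer.get() == '\n') {
                    lines++;
                }
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-demo", ".txt");
        // On some platforms a mapped file cannot be deleted right away,
        // so the demo file is only marked for deletion at JVM exit.
        file.toFile().deleteOnExit();
        Files.write(file, "one\ntwo\nthree\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(countLines(file)); // prints 3
    }
}
```

Mapping has a setup cost per file, which may be one reason it did poorly on many 4 MB files; that explanation is a guess and is not something the benchmark measured.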

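These are the average times for reading 204 files of 4 MB each on a quad-core i7 with an SSD drive. The files are generated on the fly to avoid disk caching.

    rawParse                 11.10 sec
    lineReaderParse          13.86 sec
    lineReaderParseParallel   6.00 sec
    nioFilesParse            13.52 sec
    nioAsyncParse            16.06 sec
    nioMemoryMappedParse     37.68 sec

I found that the difference between running on an SSD and on an HDD was smaller than I expected, with the SSD being approximately 15% faster. This may be because the files are generated on an unfragmented HDD and are read sequentially, so the spinning drive can perform nearly as well as an SSD.

I was surprised by the low performance of the nioAsyncParse implementation. Either I have implemented something the wrong way, or the multi-threaded implementation using NIO and a completion handler performs the same as (or even worse than) a single-threaded implementation with the java.io API. Moreover, the asynchronous parse with a CompletionHandler is much longer in lines of code and trickier to implement correctly than a straight implementation on old streams.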
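For comparison, the plain multi-threaded approach can also be expressed with an `ExecutorService` instead of hand-managed `Thread` objects. The sketch below is not part of the benchmark; the class `ParallelLineCount`, the line-counting task, and the pool size of three (picked to echo the finding above) are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not taken from the benchmark.
class ParallelLineCount {

    // Reads one file with java.io-style buffered line reading.
    static long countLines(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            long count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        }
    }

    // Submits one task per file to a fixed pool and sums the results.
    static long countAll(List<Path> files, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (Path file : files) {
                Callable<Long> task = () -> countLines(file);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // rethrows a task failure as ExecutionException
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Path> files = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Path file = Files.createTempFile("parallel-demo", ".txt");
            Files.write(file, Arrays.asList("a", "b", "c"), StandardCharsets.UTF_8);
            files.add(file);
        }
        System.out.println(countAll(files, 3)); // prints 12
        for (Path file : files) {
            Files.delete(file);
        }
    }
}
```

Unlike an empty catch block inside a thread, `Future.get()` surfaces any I/O error from a worker, so failures are not silently lost.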

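Now follow the implementations, which belong to a class containing them all plus a parametrizable main() method that allows playing with the number of files, the file size, and the degree of concurrency. Note that the size of the files varies by plus or minus 20%. This is to avoid any effect due to all the files being exactly the same size.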

rawParse

    public void rawParse(final String targetDir, final int numberOfFiles)
            throws IOException, ParseException {
        overrunCount = 0;
        final int dl = (int) ';';
        StringBuffer lineBuffer = new StringBuffer(1024);
        for (int f = 0; f < numberOfFiles; f++) {
            File fl = new File(targetDir + filenamePreffix + String.valueOf(f) + ".txt");
            FileInputStream fin = new FileInputStream(fl);
            BufferedInputStream bin = new BufferedInputStream(fin);
            int character;
            while ((character = bin.read()) != -1) {
                if (character == dl) {
                    // Here is where something is done with each line
                    doSomethingWithRawLine(lineBuffer.toString());
                    lineBuffer.setLength(0);
                } else {
                    lineBuffer.append((char) character);
                }
            }
            bin.close();
            fin.close();
        }
    }

    public final void doSomethingWithRawLine(String line) throws ParseException {
        // What to do for each line
        int fieldNumber = 0;
        final int len = line.length();
        StringBuffer fieldBuffer = new StringBuffer(256);
        for (int charPos = 0; charPos < len; charPos++) {
            char c = line.charAt(charPos);
            if (c == DL0) {
                String fieldValue = fieldBuffer.toString();
                if (fieldValue.length() > 0) {
                    switch (fieldNumber) {
                        case 0:
                            Date dt = fmt.parse(fieldValue);
                            fieldNumber++;
                            break;
                        case 1:
                            double d = Double.parseDouble(fieldValue);
                            fieldNumber++;
                            break;
                        case 2:
                            int t = Integer.parseInt(fieldValue);
                            fieldNumber++;
                            break;
                        case 3:
                            if (fieldValue.equals("overrun"))
                                overrunCount++;
                            break;
                    }
                }
                fieldBuffer.setLength(0);
            } else {
                fieldBuffer.append(c);
            }
        }
    }

lineReaderParse

    public void lineReaderParse(final String targetDir, final int numberOfFiles)
            throws IOException, ParseException {
        String line;
        for (int f = 0; f < numberOfFiles; f++) {
            File fl = new File(targetDir + filenamePreffix + String.valueOf(f) + ".txt");
            FileReader frd = new FileReader(fl);
            BufferedReader brd = new BufferedReader(frd);

            while ((line = brd.readLine()) != null)
                doSomethingWithLine(line);
            brd.close();
            frd.close();
        }
    }

    public final void doSomethingWithLine(String line) throws ParseException {
        // Example of what to do for each line
        String[] fields = line.split(";");
        Date dt = fmt.parse(fields[0]);
        double d = Double.parseDouble(fields[1]);
        int t = Integer.parseInt(fields[2]);
        if (fields[3].equals("overrun"))
            overrunCount++;
    }

lineReaderParseParallel

    public void lineReaderParseParallel(final String targetDir, final int numberOfFiles,
                                        final int degreeOfParalelism)
            throws IOException, ParseException, InterruptedException {
        Thread[] pool = new Thread[degreeOfParalelism];
        int batchSize = numberOfFiles / degreeOfParalelism;
        for (int b = 0; b < degreeOfParalelism; b++) {
            // Each thread handles files [b*batchSize, (b+1)*batchSize)
            pool[b] = new LineReaderParseThread(targetDir, b * batchSize, (b + 1) * batchSize);
            pool[b].start();
        }
        for (int b = 0; b < degreeOfParalelism; b++)
            pool[b].join();
    }

    class LineReaderParseThread extends Thread {

        private String targetDir;
        private int fileFrom;
        private int fileTo;
        private DateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        private int overrunCounter = 0;

        public LineReaderParseThread(String targetDir, int fileFrom, int fileTo) {
            this.targetDir = targetDir;
            this.fileFrom = fileFrom;
            this.fileTo = fileTo;
        }

        private void doSomethingWithTheLine(String line) throws ParseException {
            String[] fields = line.split(DL);
            Date dt = fmt.parse(fields[0]);
            double d = Double.parseDouble(fields[1]);
            int t = Integer.parseInt(fields[2]);
            if (fields[3].equals("overrun"))
                overrunCounter++;
        }

        @Override
        public void run() {
            String line;
            for (int f = fileFrom; f < fileTo; f++) {
                File fl = new File(targetDir + filenamePreffix + String.valueOf(f) + ".txt");
                try {
                    FileReader frd = new FileReader(fl);
                    BufferedReader brd = new BufferedReader(frd);
                    while ((line = brd.readLine()) != null) {
                        doSomethingWithTheLine(line);
                    }
                    brd.close();
                    frd.close();
                } catch (IOException | ParseException ioe) {
                    // Errors are silently ignored in this benchmark
                }
            }
        }
    }

nioFilesParse

    public void nioFilesParse(final String targetDir, final int numberOfFiles)
            throws IOException, ParseException {
        for (int f = 0; f < numberOfFiles; f++) {
            Path ph = Paths.get(targetDir + filenamePreffix + String.valueOf(f) + ".txt");
            Consumer<String> action = new LineConsumer();
            Stream<String> lines = Files.lines(ph);
            lines.forEach(action);
            lines.close();
        }
    }

    class LineConsumer implements Consumer<String> {

        @Override
        public void accept(String line) {
            // What to do for each line
            String[] fields = line.split(DL);
            if (fields.length > 1) {
                try {
                    Date dt = fmt.parse(fields[0]);
                } catch (ParseException e) {
                }
                double d = Double.parseDouble(fields[1]);
                int t = Integer.parseInt(fields[2]);
                if (fields[3].equals("overrun"))
                    overrunCount++;
            }
        }
    }

nioAsyncParse

    public void nioAsyncParse(final String targetDir, final int numberOfFiles,
                              final int numberOfThreads, final int bufferSize)
            throws IOException, ParseException, InterruptedException {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(numberOfThreads);
        ConcurrentLinkedQueue<ByteBuffer> byteBuffers = new ConcurrentLinkedQueue<ByteBuffer>();

        for (int b = 0; b < numberOfThreads; b++)
            byteBuffers.add(ByteBuffer.allocate(bufferSize));

        for (int f = 0; f < numberOfFiles; f++) {
            consumerThreads.acquire();
            String fileName = targetDir + filenamePreffix + String.valueOf(f) + ".txt";
            AsynchronousFileChannel channel = AsynchronousFileChannel.open(
                    Paths.get(fileName), EnumSet.of(StandardOpenOption.READ), pool);
            BufferConsumer consumer = new BufferConsumer(byteBuffers, fileName, bufferSize);
            channel.read(consumer.buffer(), 0L, channel, consumer);
        }
        consumerThreads.acquire(numberOfThreads);
    }

    class BufferConsumer implements CompletionHandler<Integer, AsynchronousFileChannel> {

        private ConcurrentLinkedQueue<ByteBuffer> buffers;
        private ByteBuffer bytes;
        private String file;
        private StringBuffer chars;
        private int limit;
        private long position;
        private DateFormat frmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

        public BufferConsumer(ConcurrentLinkedQueue<ByteBuffer> byteBuffers,
                              String fileName, int bufferSize) {
            buffers = byteBuffers;
            bytes = buffers.poll();
            if (bytes == null)
                bytes = ByteBuffer.allocate(bufferSize);

            file = fileName;
            chars = new StringBuffer(bufferSize);
            frmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            limit = bufferSize;
            position = 0L;
        }

        public ByteBuffer buffer() {
            return bytes;
        }

        @Override
        public synchronized void completed(Integer result, AsynchronousFileChannel channel) {
            if (result != -1) {
                bytes.flip();
                final int len = bytes.limit();
                int i = 0;
                try {
                    for (i = 0; i < len; i++) {
                        byte by = bytes.get();
                        if (by == '\n') {
                            // ***
                            // The code used to process the line goes here
                            chars.setLength(0);
                        } else {
                            chars.append((char) by);
                        }
                    }
                } catch (Exception x) {
                    System.out.println(
                        "Caught exception " + x.getClass().getName() + " " + x.getMessage() +
                        " i=" + String.valueOf(i) + ", limit=" + String.valueOf(len) +
                        ", position=" + String.valueOf(position));
                }

                if (len == limit) {
                    bytes.clear();
                    position += len;
                    channel.read(bytes, position, channel, this);
                } else {
                    try {
                        channel.close();
                    } catch (IOException e) {
                    }
                    consumerThreads.release();
                    bytes.clear();
                    buffers.add(bytes);
                }
            } else {
                try {
                    channel.close();
                } catch (IOException e) {
                }
                consumerThreads.release();
                bytes.clear();
                buffers.add(bytes);
            }
        }

        @Override
        public void failed(Throwable e, AsynchronousFileChannel channel) {
        }
    }

Full runnable implementation of all cases:

https://github.com/sergiomt/javaiobenchmark/blob/master/FileReadBenchmark.java

Here are the three working and tested methods:

Using `BufferedReader`

    package io;

    import java.io.*;

    public class ReadFromFile2 {
        public static void main(String[] args) throws Exception {
            File file = new File("C:\\Users\\pankaj\\Desktop\\test.java");
            // try-with-resources closes the reader when done
            try (BufferedReader br = new BufferedReader(new FileReader(file))) {
                String st;
                while ((st = br.readLine()) != null) {
                    System.out.println(st);
                }
            }
        }
    }

Using `Scanner`

    package io;

    import java.io.File;
    import java.util.Scanner;

    public class ReadFromFileUsingScanner {
        public static void main(String[] args) throws Exception {
            File file = new File("C:\\Users\\pankaj\\Desktop\\test.java");
            try (Scanner sc = new Scanner(file)) {
                while (sc.hasNextLine()) {
                    System.out.println(sc.nextLine());
                }
            }
        }
    }

Using `FileReader`

    package io;

    import java.io.*;

    public class ReadingFromFile {
        public static void main(String[] args) throws Exception {
            try (FileReader fr = new FileReader("C:\\Users\\pankaj\\Desktop\\test.java")) {
                int i;
                // read() returns one character at a time, or -1 at end of file
                while ((i = fr.read()) != -1) {
                    System.out.print((char) i);
                }
            }
        }
    }

Reading the entire file without a loop, using the `Scanner` class

    package io;

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.Scanner;

    public class ReadingEntireFileWithoutLoop {
        public static void main(String[] args) throws FileNotFoundException {
            File file = new File("C:\\Users\\pankaj\\Desktop\\test.java");
            try (Scanner sc = new Scanner(file)) {
                sc.useDelimiter("\\Z");
                System.out.println(sc.next());
            }
        }
    }

Using BufferedReader:

    import java.io.BufferedReader;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.io.IOException;

    // try-with-resources closes the reader, which the nested try blocks did not
    try (BufferedReader br = new BufferedReader(new FileReader("/fileToRead.txt"))) {
        String x;
        while ((x = br.readLine()) != null) {
            // Printing out each line in the file
            System.out.println(x);
        }
    } catch (FileNotFoundException e) {
        System.out.println(e);
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

Below is a one-liner for doing it the Java 8 way. It assumes the text.txt file is in the root of the Eclipse project directory.

 Files.lines(Paths.get("text.txt")).collect(Collectors.toList());

This is basically the exact same as Jesus Ramos' answer, except with a File instead of a FileReader, plus iteration to step through the contents of the file.

    Scanner in = new Scanner(new File("filename.txt"));

    while (in.hasNextLine()) { // Iterates over each line in the file
        String line = in.nextLine();
        // Do something with line
    }

    in.close(); // Don't forget to close the Scanner to avoid resource leaks

... throws FileNotFoundException

Probably not as fast as buffered I/O, but quite terse:

    String content;
    try (Scanner scanner = new Scanner(textFile).useDelimiter("\\Z")) {
        content = scanner.next();
    }

The \Z pattern tells the Scanner that the delimiter is EOF.
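One caveat with this idiom is that `next()` throws `NoSuchElementException` when the file is empty. A slightly more defensive version might look like the sketch below; the class `ScannerWholeFile`, the method name, and the UTF-8 charset are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

// Hypothetical demo class for the \Z delimiter idiom.
class ScannerWholeFile {

    // Returns the file content as one token, or "" for an empty file.
    static String readAll(Path file) throws IOException {
        try (Scanner scanner = new Scanner(file, "UTF-8").useDelimiter("\\Z")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("scanner-demo", ".txt");
        Files.write(file, "hello\nworld".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(file));
        Files.delete(file);
    }
}
```

Because `\Z` matches at the end of input but before a final line terminator, a trailing newline in the file is not part of the returned string.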

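I don't see it mentioned yet in the other answers so far. But if "best" means speed, then the new Java I/O (NIO) might provide the fastest performance, though it is not always the easiest to figure out for someone learning.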
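For readers who have not used NIO channels, a minimal sketch of reading a whole file through a `FileChannel` and a `ByteBuffer` could look like this. The class `ChannelReader`, the 8 KB buffer size, and the UTF-8 charset are assumptions for illustration; whether this is actually faster than buffered java.io depends on the workload and should be measured.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class for channel-based reading.
class ChannelReader {

    // Reads the file in 8 KB chunks and decodes the bytes once at the end,
    // so a multi-byte character split across two chunks is still decoded correctly.
    static String readAll(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(8192);
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            while (channel.read(buffer) != -1) {
                buffer.flip();                                   // switch to draining
                bytes.write(buffer.array(), 0, buffer.limit());  // copy what was read
                buffer.clear();                                  // switch back to filling
            }
            return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("channel-demo", ".txt");
        Files.write(file, "read through a channel".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(file)); // prints "read through a channel"
        Files.delete(file);
    }
}
```

The flip and clear calls are the part newcomers tend to get wrong, which is probably what makes NIO harder to learn than the stream classes.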

http://download.oracle.com/javase/tutorial/essential/io/file.html

The easiest way to read data from a file in Java is to use the File class to open the file and the Scanner class to read its content.

    public static void main(String args[]) throws Exception {
        File f = new File("input.txt");
        takeInputIn2DArray(f);
    }

    public static void takeInputIn2DArray(File f) throws Exception {
        // Reads 20 x 20 whitespace-separated integers from the file
        try (Scanner s = new Scanner(f)) {
            int a[][] = new int[20][20];
            for (int i = 0; i < 20; i++) {
                for (int j = 0; j < 20; j++) {
                    a[i][j] = s.nextInt();
                }
            }
        }
    }

P.S.: Don't forget to import java.util.*; for Scanner to work.

For JSF-based Maven web applications, just use the ClassLoader and the Resources folder to read in any file you want:

Put any file you want to read in the Resources folder.

Put the Apache Commons IO dependency into your POM:

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-io</artifactId>
        <version>1.3.2</version>
    </dependency>

Use the code below to read it (e.g. below is reading in a .json file):

    String metadata = null;
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    // getResourceAsStream returns a plain InputStream (not necessarily a
    // FileInputStream), and ClassLoader resource names take no leading slash
    try (InputStream inputStream = loader.getResourceAsStream("metadata.json")) {
        if (inputStream != null) {
            metadata = IOUtils.toString(inputStream);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return metadata;

You can do the same for text files, .properties files, XSD schemas, etc.

Guava provides a one-liner for this:

    import com.google.common.base.Charsets;
    import com.google.common.io.Files;

    String contents = Files.toString(filePath, Charsets.UTF_8);

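This might not be the exact answer to the question. It is just another way of reading a file where you do not explicitly specify the path to your file in your Java code and instead read it via standard input, redirected on the command line.

With the following code:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.IOException;

    public class InputReader {
        public static void main(String[] args) throws IOException {
            BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
            String s = "";
            while ((s = br.readLine()) != null) {
                System.out.println(s);
            }
        }
    }

just go ahead and run it with:

    java InputReader < input.txt

This would read the contents of input.txt and print them to your console.

You can also make your System.out.println() write to a specific file through the command line as follows:

    java InputReader < input.txt > output.txt

This would read from input.txt and write to output.txt.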

Cactoos gives you a declarative one-liner:

    new TextOf(new File("a.txt")).asString();

Here is a simple solution:

    String content = new String(
            java.nio.file.Files.readAllBytes(java.nio.file.Paths.get("sample.txt")));

If this is about simplicity of structure, use Java kiss:

    import static kiss.API.*;

    class App {
        void run() {
            String line;
            try (Close in = inOpen("file.dat")) {
                while ((line = readLine()) != null) {
                    println(line);
                }
            }
        }
    }

The code I wrote is much faster for very large files:

    public String readDoc(File f) {
        // StringBuilder avoids re-copying the whole text on every chunk
        StringBuilder text = new StringBuilder();
        int read, N = 1024 * 1024;
        char[] buffer = new char[N];

        try (BufferedReader br = new BufferedReader(new FileReader(f))) {
            // read() returns -1 at end of file; a short read does not mean EOF
            while ((read = br.read(buffer, 0, N)) != -1) {
                text.append(buffer, 0, read);
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }

        return text.toString();
    }
