AmazonS3 putObject with InputStream length example

I am uploading a file to S3 using Java - this is what I have so far:

    AmazonS3 s3 = new AmazonS3Client(new BasicAWSCredentials("XX", "YY"));
    List<Bucket> buckets = s3.listBuckets();
    s3.putObject(new PutObjectRequest(buckets.get(0).getName(), fileName, stream, new ObjectMetadata()));

The file is being uploaded, but a warning is raised when I do not set the content length:

    com.amazonaws.services.s3.AmazonS3Client putObject: No content length specified for stream data. Stream contents will be buffered in memory and could result in out of memory errors.

This is the file I am uploading; the variable stream is an InputStream, from which I can get the byte array like this: IOUtils.toByteArray(stream)

So when I try to set the content length and the MD5 (taken from here) like this:

    // get MD5 base64 hash
    MessageDigest messageDigest = MessageDigest.getInstance("MD5");
    messageDigest.reset();
    messageDigest.update(IOUtils.toByteArray(stream));
    byte[] resultByte = messageDigest.digest();
    String hashtext = new String(Hex.encodeHex(resultByte));

    ObjectMetadata meta = new ObjectMetadata();
    meta.setContentLength(IOUtils.toByteArray(stream).length);
    meta.setContentMD5(hashtext);

It causes the following error to come back from S3:

    The Content-MD5 you specified was invalid.

What am I doing wrong?

Any help is greatly appreciated!

P.S. I am on Google App Engine - I cannot write the file to disk or create a temp file, because App Engine does not support FileOutputStream.

Because the original question was never answered, and I had to run into this same problem, the solution for the MD5 problem is that S3 does not want the hex-encoded MD5 string we normally think of.

Instead, I had to do this:

    // content is the passed-in InputStream
    byte[] resultByte = DigestUtils.md5(content);
    String streamMD5 = new String(Base64.encodeBase64(resultByte));
    metaData.setContentMD5(streamMD5);

Essentially, the MD5 value they want is the Base64-encoded raw MD5 byte array, not the hex string. When I switched to this, it started working great for me.
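For reference, a minimal sketch of the same fix using only JDK classes (MessageDigest plus java.util.Base64, available since Java 8); it assumes the stream fits in memory, and checked exceptions are omitted:

    // Buffer the stream, digest it, then Base64-encode the raw digest
    // bytes (not a hex string) for the Content-MD5 header.
    byte[] bytes = IOUtils.toByteArray(stream);
    byte[] md5 = MessageDigest.getInstance("MD5").digest(bytes);

    ObjectMetadata meta = new ObjectMetadata();
    meta.setContentLength(bytes.length);
    meta.setContentMD5(java.util.Base64.getEncoder().encodeToString(md5));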

If all you are trying to do is resolve the content length error from Amazon, then you can just read the bytes from the input stream into a Long and add that to the metadata:

    /*
     * Obtain the content length of the input stream for the S3 header.
     */
    byte[] contentBytes = null;
    try {
        InputStream is = event.getFile().getInputstream();
        contentBytes = IOUtils.toByteArray(is);
    } catch (IOException e) {
        System.err.printf("Failed while reading bytes from %s", e.getMessage());
    }

    Long contentLength = Long.valueOf(contentBytes.length);

    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(contentLength);

    /*
     * Reobtain the uploaded tmp file as an input stream.
     */
    InputStream inputStream = event.getFile().getInputstream();

    /*
     * Put the object in S3.
     */
    try {
        s3client.putObject(new PutObjectRequest(bucketName, keyName, inputStream, metadata));
    } catch (AmazonServiceException ase) {
        System.out.println("Error Message: " + ase.getMessage());
        System.out.println("HTTP Status Code: " + ase.getStatusCode());
        System.out.println("AWS Error Code: " + ase.getErrorCode());
        System.out.println("Error Type: " + ase.getErrorType());
        System.out.println("Request ID: " + ase.getRequestId());
    } catch (AmazonClientException ace) {
        System.out.println("Error Message: " + ace.getMessage());
    } finally {
        if (inputStream != null) {
            inputStream.close();
        }
    }

You will need to read the input stream twice using this exact method, so if you are uploading a very large file you might want to look at reading it once into an array and then reading it from there.
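A minimal sketch of that buffer-once variant, reusing the names from the snippet above (event, s3client, bucketName and keyName are assumed from that context):

    // Read the upload into memory once; ByteArrayInputStream then
    // replays the buffered bytes without reopening the original stream.
    byte[] contentBytes = IOUtils.toByteArray(event.getFile().getInputstream());

    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(contentBytes.length);

    s3client.putObject(new PutObjectRequest(bucketName, keyName,
            new ByteArrayInputStream(contentBytes), metadata));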

For uploading, the S3 SDK has two putObject request constructors:

 PutObjectRequest(String bucketName, String key, File file) 

 PutObjectRequest(String bucketName, String key, InputStream input, ObjectMetadata metadata) 

The InputStream + ObjectMetadata variant needs, at minimum, the content length of your input stream in the metadata. If you don't supply it, the SDK will buffer the stream in memory to obtain that information, which could cause OOM errors. Alternatively, you can do your own in-memory buffering to get the length, but then you need to obtain a second input stream.

Not asked by the OP (given the restrictions of his environment), but for someone else such as me: I find it easier, and safer (if you have access to a temp file), to write the input stream to a temp file and put the temp file. There is no in-memory buffering, and no requirement to create a second input stream:

    AmazonS3 s3Service = new AmazonS3Client(awsCredentials);
    File scratchFile = File.createTempFile("prefix", "suffix");
    try {
        FileUtils.copyInputStreamToFile(inputStream, scratchFile);
        PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, id, scratchFile);
        PutObjectResult putObjectResult = s3Service.putObject(putObjectRequest);
    } finally {
        if (scratchFile.exists()) {
            scratchFile.delete();
        }
    }

When writing to S3, you need to specify the length of the S3 object to be sure there are no out-of-memory errors.

Using IOUtils.toByteArray(stream) is also prone to OOM errors, because it is backed by a ByteArrayOutputStream.

So the best option is to first write the input stream to a temp file on local disk, and then use that file to write to S3, specifying the length of the temp file.
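A minimal sketch of that approach with plain java.nio (s3Client, bucketName and key are illustrative names):

    // Spool the stream to a temp file; the SDK reads the length from
    // the file itself, so no ObjectMetadata is needed.
    Path tmp = Files.createTempFile("s3-upload", ".tmp");
    try {
        Files.copy(inputStream, tmp, StandardCopyOption.REPLACE_EXISTING);
        s3Client.putObject(new PutObjectRequest(bucketName, key, tmp.toFile()));
    } finally {
        Files.deleteIfExists(tmp);
    }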

I am actually doing somewhat the same thing, but on my AWS S3 storage:

Code for the servlet which receives the uploaded file:

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.util.List;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.apache.commons.fileupload.FileItem;
    import org.apache.commons.fileupload.disk.DiskFileItemFactory;
    import org.apache.commons.fileupload.servlet.ServletFileUpload;

    import com.src.code.s3.S3FileUploader;

    public class FileUploadHandler extends HttpServlet {

        protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
            doPost(request, response);
        }

        protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
            PrintWriter out = response.getWriter();

            try {
                List<FileItem> multipartfiledata = new ServletFileUpload(new DiskFileItemFactory()).parseRequest(request);

                // upload to S3
                S3FileUploader s3 = new S3FileUploader();
                String result = s3.fileUploader(multipartfiledata);

                out.print(result);
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }
    }

Code which uploads this data as an AWS object:

    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.util.List;
    import java.util.UUID;

    import org.apache.commons.fileupload.FileItem;

    import com.amazonaws.AmazonClientException;
    import com.amazonaws.AmazonServiceException;
    import com.amazonaws.auth.ClasspathPropertiesFileCredentialsProvider;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.model.ObjectMetadata;
    import com.amazonaws.services.s3.model.PutObjectRequest;
    import com.amazonaws.services.s3.model.S3Object;

    public class S3FileUploader {

        private static String bucketName = "***NAME OF YOUR BUCKET***";
        private static String keyName = "Object-" + UUID.randomUUID();

        public String fileUploader(List<FileItem> fileData) throws IOException {
            AmazonS3 s3 = new AmazonS3Client(new ClasspathPropertiesFileCredentialsProvider());
            String result = "Upload unsuccessful because ";

            try {
                S3Object s3Object = new S3Object();

                ObjectMetadata omd = new ObjectMetadata();
                omd.setContentType(fileData.get(0).getContentType());
                omd.setContentLength(fileData.get(0).getSize());
                omd.setHeader("filename", fileData.get(0).getName());

                ByteArrayInputStream bis = new ByteArrayInputStream(fileData.get(0).get());

                s3Object.setObjectContent(bis);
                s3.putObject(new PutObjectRequest(bucketName, keyName, bis, omd));
                s3Object.close();

                result = "Uploaded Successfully.";
            } catch (AmazonServiceException ase) {
                System.out.println("Caught an AmazonServiceException, which means your request made it to Amazon S3, but was "
                        + "rejected with an error response for some reason.");
                System.out.println("Error Message: " + ase.getMessage());
                System.out.println("HTTP Status Code: " + ase.getStatusCode());
                System.out.println("AWS Error Code: " + ase.getErrorCode());
                System.out.println("Error Type: " + ase.getErrorType());
                System.out.println("Request ID: " + ase.getRequestId());

                result = result + ase.getMessage();
            } catch (AmazonClientException ace) {
                System.out.println("Caught an AmazonClientException, which means the client encountered an internal error while "
                        + "trying to communicate with S3, such as not being able to access the network.");

                result = result + ace.getMessage();
            } catch (Exception e) {
                result = result + e.getMessage();
            }

            return result;
        }
    }

Note: I am using an AWS properties file for the credentials.
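If I remember correctly, ClasspathPropertiesFileCredentialsProvider expects a file named AwsCredentials.properties on the classpath, shaped roughly like this (values are placeholders):

    accessKey = YOUR_ACCESS_KEY
    secretKey = YOUR_SECRET_KEY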

Hope this helps.

I created a library that uses multipart uploads in the background to avoid buffering everything in memory, and which also does not write to disk: https://github.com/alexmojaki/s3-stream-upload
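If you would rather stay within the AWS SDK, its TransferManager also does managed multipart uploads; a rough sketch (s3Client, bucketName, key, inputStream and metadata are assumed, and waitForCompletion throws InterruptedException):

    // TransferManager splits large uploads into parts behind the scenes.
    TransferManager tm = TransferManagerBuilder.standard()
            .withS3Client(s3Client)
            .build();
    Upload upload = tm.upload(bucketName, key, inputStream, metadata);
    upload.waitForCompletion();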

Adding the log4j-1.2.12.jar file resolved the issue for me.