Java: Reading from a ZipInputStream into a ByteArrayOutputStream

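I am trying to read a single file from a java.util.zip.ZipInputStream and copy it into a java.io.ByteArrayOutputStream (so that I can then create a java.io.ByteArrayInputStream and hand that to a third-party library that will end up closing the stream, and I don't want my ZipInputStream to be closed).

I'm probably missing something basic here, but I never enter the while loop: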

    ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
    int bytesRead;
    byte[] tempBuffer = new byte[8192*2];
    try {
        while ((bytesRead = zipStream.read(tempBuffer)) != -1) {
            streamBuilder.write(tempBuffer, 0, bytesRead);
        }
    } catch (IOException e) {
        // ...
    }

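What am I missing that would let me copy the stream?

Edit:

I should have mentioned earlier that this ZipInputStream does not come from a file, so I don't think I can use a ZipFile. It comes from a file uploaded through a servlet.

Also, I have already called getNextEntry() on the ZipInputStream before reaching this snippet. If I don't copy the file into another InputStream (via the OutputStream mentioned above) and just pass the ZipInputStream to my third-party library, the library closes the stream and I can't do anything more with it, such as processing the remaining files in the stream.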

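Your loop looks valid. What does the following call return when run on its own?

    zipStream.read(tempBuffer)

If it returns -1, then the zipStream was closed before you got it, and all bets are off. It's time to use your debugger and make sure that what is being passed to you is actually valid.

When you call getNextEntry(), does it return a value, and is the data in the entry meaningful (i.e. does getCompressedSize() return a valid value)? If you are reading a zip file whose entries do not carry that information embedded ahead of the data, then ZipInputStream isn't going to work for you.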
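To make that check concrete, here is a small diagnostic sketch. The class name ZipStreamProbe and the helper sampleZip are made up for illustration; only the java.util.zip calls are real API. It advances to the next entry, reports what the stream says about it, and shows the result of the first read. Note that the probe consumes the first bytes of the entry, so it is meant for debugging, not for production code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical debugging helper, not part of any library.
final class ZipStreamProbe {

    /** Advances to the next entry and describes what the stream reports about it. */
    static String probe(ZipInputStream zipStream) throws IOException {
        ZipEntry entry = zipStream.getNextEntry();
        if (entry == null) {
            return "no entry: stream is exhausted or does not contain zip data";
        }
        // getSize() and getCompressedSize() return -1 when the value is not known.
        String info = entry.getName()
                + " size=" + entry.getSize()
                + " compressedSize=" + entry.getCompressedSize();
        // This consumes up to 16 bytes of the entry; -1 would mean "end of entry".
        int firstRead = zipStream.read(new byte[16]);
        return info + " firstRead=" + firstRead;
    }

    /** Demo helper: builds a zip in memory from name/content pairs. */
    static byte[] sampleZip(String... namesAndContents) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(raw)) {
            for (int i = 0; i < namesAndContents.length; i += 2) {
                zos.putNextEntry(new ZipEntry(namesAndContents[i]));
                zos.write(namesAndContents[i + 1].getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return raw.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip("hello.txt", "hello");
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip))) {
            System.out.println(probe(zis)); // the only entry
            System.out.println(probe(zis)); // nothing left
        }
    }
}
```

If the first probe already reports "no entry", the problem is upstream of the copy loop: the bytes handed to the ZipInputStream are not a zip archive, or they have already been consumed.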

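Some useful tidbits about the Zip format:

Each file embedded in a zip file has a header. This header can contain useful information (such as the compressed length of the stream, its offset in the file, and its CRC), or it can contain magic values that basically say "the information isn't in the stream header, you have to check the zip post-amble".

Each zip file also has a table, appended to the end of the file, that lists every zip entry together with its actual values. The table at the end is mandatory, and the values in it must be correct. In contrast, the values embedded in the stream do not have to be provided.

If you use ZipFile, it reads the table at the end of the zip. If you use ZipInputStream, I suspect that getNextEntry() tries to use the entries embedded in the stream. If those values aren't specified, then ZipInputStream has no idea how long the stream might be. The inflate algorithm is self-terminating (you don't actually need to know the uncompressed length of the output stream to fully recover the output), but it's possible that the Java version of this reader doesn't handle that situation very well.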
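For what it's worth, a loop that reads until read() returns -1 does not depend on the sizes in the entry header at all, because -1 marks the end of the current entry rather than the end of the archive. The sketch below buffers every entry of an uploaded stream that way; EntryCopier, readAll and sampleZip are names invented for this example, and whether buffering everything in memory is acceptable is an assumption about the upload size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; only the java.io and java.util.zip calls are real API.
final class EntryCopier {

    /** Copies the rest of the current entry. read() returns -1 at the end of the entry. */
    static byte[] readCurrentEntry(ZipInputStream zipStream) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = zipStream.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Buffers every file entry, keyed by entry name, in archive order. */
    static Map<String, byte[]> readAll(InputStream upload) throws IOException {
        Map<String, byte[]> result = new LinkedHashMap<>();
        try (ZipInputStream zipStream = new ZipInputStream(upload)) {
            ZipEntry entry;
            while ((entry = zipStream.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    result.put(entry.getName(), readCurrentEntry(zipStream));
                }
            }
        }
        return result;
    }

    /** Demo helper: builds a zip in memory from name/content pairs. */
    static byte[] sampleZip(String... namesAndContents) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(raw)) {
            for (int i = 0; i < namesAndContents.length; i += 2) {
                zos.putNextEntry(new ZipEntry(namesAndContents[i]));
                zos.write(namesAndContents[i + 1].getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return raw.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip("a.txt", "alpha", "b.txt", "beta");
        Map<String, byte[]> files = readAll(new ByteArrayInputStream(zip));
        for (Map.Entry<String, byte[]> e : files.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().length + " bytes");
        }
    }
}
```

Each byte[] can then be wrapped in its own ByteArrayInputStream and handed to code that closes its input, without affecting the ZipInputStream.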

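I will say that it's fairly unusual for a servlet to hand you a ZipInputStream (if you are going to receive compressed content, it's much more common to receive an InflaterInputStream).

You probably tried reading from a FileInputStream like this:

    ZipInputStream in = new ZipInputStream(new FileInputStream(...));

This won't work as-is, because a zip archive can contain multiple files and you need to specify which one to read.

You could use java.util.zip.ZipFile together with a library such as IOUtils from Apache Commons IO or ByteStreams from Guava to help you copy the stream.

Example: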

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (ZipFile zipFile = new ZipFile("foo.zip")) {
        ZipEntry zipEntry = zipFile.getEntry("fileInTheZip.txt");

        try (InputStream in = zipFile.getInputStream(zipEntry)) {
            IOUtils.copy(in, out);
        }
    }

You're missing the call

    ZipEntry entry = (ZipEntry) zipStream.getNextEntry();

which positions the stream at the first decompressed byte of the first entry.

    ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
    int bytesRead;
    byte[] tempBuffer = new byte[8192*2];
    ZipEntry entry = (ZipEntry) zipStream.getNextEntry();
    try {
        while ((bytesRead = zipStream.read(tempBuffer)) != -1) {
            streamBuilder.write(tempBuffer, 0, bytesRead);
        }
    } catch (IOException e) {
        // ...
    }

I would use IOUtils from the Commons IO project.

    IOUtils.copy(zipStream, byteArrayOutputStream);
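If adding a dependency is not an option, the JDK can do the same copy on its own from Java 9 onwards via InputStream.transferTo. The sketch below is only an illustration under that assumption; the class JdkCopy and its methods are hypothetical names, not part of Commons IO or the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper showing a dependency-free equivalent of IOUtils.copy.
final class JdkCopy {

    /** Buffers the rest of the current entry and returns an independent stream over it. */
    static ByteArrayInputStream bufferCurrentEntry(ZipInputStream zipStream) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        zipStream.transferTo(out); // Java 9+; stops at the end of the current entry
        return new ByteArrayInputStream(out.toByteArray());
    }

    /** Demo helper: builds a zip in memory from name/content pairs. */
    static byte[] sampleZip(String... namesAndContents) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(raw)) {
            for (int i = 0; i < namesAndContents.length; i += 2) {
                zos.putNextEntry(new ZipEntry(namesAndContents[i]));
                zos.write(namesAndContents[i + 1].getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return raw.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip("a.txt", "alpha", "b.txt", "beta");
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                ByteArrayInputStream copy = bufferCurrentEntry(zis);
                // Closing the copy has no effect on the ZipInputStream.
                copy.close();
                System.out.println("buffered " + entry.getName());
            }
        }
    }
}
```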

You could implement your own wrapper around the ZipInputStream that ignores close(), and hand that wrapper to the third-party library.

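    // needs java.io.IOException, java.io.InputStream, java.util.zip.ZipInputStream
    thirdPartyLib.handleZipData(new CloseIgnoringInputStream(zipStream));

    class CloseIgnoringInputStream extends InputStream
    {
        private final ZipInputStream stream;

        public CloseIgnoringInputStream(ZipInputStream inStream)
        {
            stream = inStream;
        }

        @Override
        public int read() throws IOException {
            return stream.read();
        }

        // Delegate bulk reads too; otherwise InputStream falls back to one read() call per byte.
        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return stream.read(b, off, len);
        }

        @Override
        public void close()
        {
            // ignore, so the underlying ZipInputStream stays open
        }

        public void reallyClose() throws IOException
        {
            stream.close();
        }
    }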
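A shorter variant of the same idea, sketched here as an assumption about how you might structure it, extends java.io.FilterInputStream so that all read methods are delegated automatically and only close() is overridden. NonClosingInputStream, NonClosingDemo and consumeAndClose are hypothetical names; consumeAndClose merely stands in for the third-party library. Commons IO also ships a CloseShieldInputStream for this purpose if you already depend on it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Delegates every read to the wrapped stream but turns close() into a no-op. */
final class NonClosingInputStream extends FilterInputStream {
    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() {
        // Intentionally empty: the owner of the wrapped stream closes it.
    }
}

final class NonClosingDemo {

    /** Stand-in for the third-party library: reads everything, then closes its input. */
    static String consumeAndClose(InputStream in) throws IOException {
        try (InputStream s = in) {
            return new String(s.readAllBytes(), StandardCharsets.UTF_8); // Java 9+
        }
    }

    /** Hands each entry to the consumer; the ZipInputStream survives every close(). */
    static List<String> processAll(ZipInputStream zipStream) throws IOException {
        List<String> results = new ArrayList<>();
        while (zipStream.getNextEntry() != null) {
            results.add(consumeAndClose(new NonClosingInputStream(zipStream)));
        }
        return results;
    }

    /** Demo helper: builds a zip in memory from name/content pairs. */
    static byte[] sampleZip(String... namesAndContents) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(raw)) {
            for (int i = 0; i < namesAndContents.length; i += 2) {
                zos.putNextEntry(new ZipEntry(namesAndContents[i]));
                zos.write(namesAndContents[i + 1].getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return raw.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip("a.txt", "alpha", "b.txt", "beta");
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip))) {
            System.out.println(processAll(zis));
        }
    }
}
```

Compared with copying into a ByteArrayOutputStream, this avoids holding a whole entry in memory, at the cost of trusting the library to read only the current entry.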

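I would call getNextEntry() on the ZipInputStream until it is at the entry you want (using ZipEntry.getName() etc.). Calling getNextEntry() advances the "cursor" to the beginning of the entry it returns. Then use ZipEntry.getSize() to determine how many bytes you should read with zipInputStream.read().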
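A rough sketch of that lookup follows. One caveat: ZipEntry.getSize() returns -1 when the size is not known, so this version reads until the end of the entry instead of counting bytes. EntryFinder and readEntry are hypothetical names, and returning null for a missing entry is just one possible convention.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; only the java.util.zip calls are real API.
final class EntryFinder {

    /**
     * Advances through the stream until an entry with the given name is found and
     * returns its content, or null if the remaining entries do not include it.
     */
    static byte[] readEntry(ZipInputStream zipStream, String wantedName) throws IOException {
        ZipEntry entry;
        while ((entry = zipStream.getNextEntry()) != null) {
            if (entry.getName().equals(wantedName)) {
                // Java 9+; reads up to the end of the current entry only.
                return zipStream.readAllBytes();
            }
        }
        return null;
    }

    /** Demo helper: builds a zip in memory from name/content pairs. */
    static byte[] sampleZip(String... namesAndContents) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(raw)) {
            for (int i = 0; i < namesAndContents.length; i += 2) {
                zos.putNextEntry(new ZipEntry(namesAndContents[i]));
                zos.write(namesAndContents[i + 1].getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return raw.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip("a.txt", "alpha", "b.txt", "beta", "c.txt", "gamma");
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip))) {
            byte[] content = readEntry(zis, "b.txt");
            System.out.println(content == null ? "not found" : new String(content, StandardCharsets.UTF_8));
        }
    }
}
```

Because a ZipInputStream only moves forward, entries that come before the requested one are skipped and cannot be revisited on the same stream.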

Please try the code below:

    // BUFFER and WorkflowServiceBusinessException are defined elsewhere in the poster's project.
    private static byte[] getZipArchiveContent(File zipName) throws WorkflowServiceBusinessException {

        // Note: this reads the raw bytes of the archive file itself; it does not decompress any entry.
        byte[] data = new byte[BUFFER];
        ByteArrayOutputStream byteOut = new ByteArrayOutputStream();

        // try-with-resources closes both input streams, even when an exception is thrown
        try (FileInputStream fileStream = new FileInputStream(zipName);
             BufferedInputStream buffer = new BufferedInputStream(fileStream)) {

            int count;
            while ((count = buffer.read(data, 0, BUFFER)) != -1) {
                byteOut.write(data, 0, count);
            }
        } catch (IOException e) {
            throw new WorkflowServiceBusinessException(e.getMessage(), e);
        }
        return byteOut.toByteArray();
    }

It is unclear how you obtained the zipStream. It should work when you get it as shown below: