
Possible To Decompress Bz2 In Python To A File Instead Of Memory

I've worked with decompressing and reading files on the fly in memory with the bz2 library. However, I've read through the documentation and can't seem to just simply decompress t

Solution 1:

You could use the bz2.BZ2File object which provides a transparent file-like handle.

(Edit: you seem to be using that already. Don't use readlines() on a binary file, or even on a text file here: in your case the blocks it reads aren't big enough, which explains why it's slow.)

Then use shutil.copyfileobj to copy from that handle to the write handle of your output file. You can increase the block size if you can afford the memory.

import bz2
import shutil

with bz2.BZ2File("file.bz2") as fr, open("output.bin", "wb") as fw:
    shutil.copyfileobj(fr, fw)

Even if the file is big, it doesn't take more memory than the block size. Adjust the block size like this:

shutil.copyfileobj(fr, fw, length=1000000)  # read in chunks of about 1 MB
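
To make the memory argument concrete, here is a minimal sketch of the same copy written as an explicit loop. It is not how the answer does it (the answer relies on shutil.copyfileobj); the function name decompress_to_file, its parameters, and the default chunk size are assumptions chosen for illustration.

```python
import bz2


def decompress_to_file(src_path, dst_path, chunk_size=1_000_000):
    """Decompress a bz2 file to dst_path one chunk at a time.

    Hypothetical helper for illustration; names and the default
    chunk size are not from the original answer.
    Returns the number of decompressed bytes written.
    """
    total = 0
    with bz2.BZ2File(src_path) as fr, open(dst_path, "wb") as fw:
        while True:
            # read() returns at most chunk_size decompressed bytes
            chunk = fr.read(chunk_size)
            if not chunk:  # empty bytes object means end of data
                break
            fw.write(chunk)
            total += len(chunk)
    return total
```

Only one chunk of decompressed data is held at a time, so a larger chunk_size trades memory for fewer read and write calls.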

Solution 2:

For smaller files that you can store in memory before you save to a file, you can use bz2.open to decompress the file and save it as an uncompressed new file.

import bz2

# decompress data
with bz2.open('compressed_file.bz2', 'rb') as f:
    uncompressed_content = f.read()

# store decompressed file (the with block closes it automatically)
with open('new_uncompressed_file.dat', 'wb') as f:
    f.write(uncompressed_content)
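
For small files, the same idea can be sketched more compactly with bz2.decompress and pathlib. This is an alternative phrasing rather than the answer's own code: the function name decompress_small_file is hypothetical, and both the compressed and the decompressed bytes sit in memory at once, so it only suits files that comfortably fit.

```python
import bz2
from pathlib import Path


def decompress_small_file(src_path, dst_path):
    """Decompress a small bz2 file entirely in memory, then write it out.

    Hypothetical helper for illustration; not part of the original answer.
    Returns the number of decompressed bytes written.
    """
    compressed = Path(src_path).read_bytes()      # whole compressed file
    decompressed = bz2.decompress(compressed)     # whole decompressed payload
    return Path(dst_path).write_bytes(decompressed)
```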
