Possible To Decompress Bz2 In Python To A File Instead Of Memory
I've worked with decompressing and reading files on the fly in memory with the bz2 library. However, I've read through the documentation and can't seem to just simply decompress t
Solution 1:
You could use the bz2.BZ2File object, which provides a transparent file-like handle.
(Edit: you seem to be using that already, but don't call readlines() on it, whether the data is binary or text. It builds a list of every line in memory, and in your case the block size it reads with isn't big enough, which explains why it's slow.)
Then use shutil.copyfileobj to copy the decompressed stream to the write handle of your output file (you can increase the block size if you can afford the memory):
import bz2
import shutil

# BZ2File decompresses transparently as it is read
with bz2.BZ2File("file.bz2") as fr, open("output.bin", "wb") as fw:
    shutil.copyfileobj(fr, fw)
Even if the file is big, the copy holds only about one block in memory at a time. Adjust the block size like this:
shutil.copyfileobj(fr, fw, length=1000000)  # copy in chunks of about 1 MB
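To make the bounded-memory behaviour explicit, here is a minimal sketch of the same streaming copy written as a manual loop. The function name decompress_bz2_to_file and its chunk_size parameter are hypothetical, chosen only for illustration; the loop is one way to express roughly what shutil.copyfileobj does, not the library's actual implementation.

```python
import bz2


def decompress_bz2_to_file(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress src_path into dst_path and return the bytes written.

    Hypothetical helper: only one chunk of decompressed data is held
    in memory at a time.
    """
    total = 0
    with bz2.open(src_path, "rb") as fr, open(dst_path, "wb") as fw:
        # read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: fr.read(chunk_size), b""):
            fw.write(chunk)
            total += len(chunk)
    return total
```

Calling decompress_bz2_to_file("file.bz2", "output.bin") would write the decompressed data to output.bin; a larger chunk_size trades memory for fewer read calls.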
Solution 2:
For smaller files that fit in memory, you can use bz2.open to read the whole decompressed content and then save it as a new, uncompressed file.
import bz2

# decompress the data (the whole content is held in memory)
with bz2.open('compressed_file.bz2', 'rb') as f:
    uncompressed_content = f.read()

# store the decompressed file; the with block closes it automatically
with open('new_uncompressed_file.dat', 'wb') as f:
    f.write(uncompressed_content)
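For small files, the same in-memory approach can also be sketched with bz2.decompress and pathlib. The helper name decompress_small_bz2 is hypothetical. Both the compressed and the decompressed bytes are held in memory at once, so this is only an option when the data comfortably fits.

```python
import bz2
from pathlib import Path


def decompress_small_bz2(src_path, dst_path):
    """Decompress a small .bz2 file in one shot and return the output size.

    Hypothetical helper: loads the entire file into memory.
    """
    data = bz2.decompress(Path(src_path).read_bytes())
    Path(dst_path).write_bytes(data)
    return len(data)
```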