Python 3 - Faster Print & I/O
Solution 1:
If you really are I/O bound by the time it takes to write the file, multithreading with a pool of threads may help. There is a limit to how much this gains, but it at least lets you hand writes off to worker threads so that the main thread does not block on each one.
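As a rough illustration of the thread-pool idea, here is a minimal sketch using the standard library's `concurrent.futures.ThreadPoolExecutor`. The helper names `write_chunk` and `write_all`, the `part_N.txt` file naming, and the worker count are all made up for the example; it also assumes the output can be split across several files, which may not match your case.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_chunk(path, lines):
    """Write one batch of lines to its own file and return the path."""
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(lines)
    return path


def write_all(out_dir, batches, max_workers=4):
    """Submit every batch to a thread pool and wait for all writes to finish."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            # Hypothetical naming scheme: one output file per batch.
            pool.submit(write_chunk, os.path.join(out_dir, f"part_{i}.txt"), batch)
            for i, batch in enumerate(batches)
        ]
        # result() re-raises any exception that occurred in a worker thread.
        return [future.result() for future in futures]


if __name__ == "__main__":
    batches = [[f"row {i}-{j}\n" for j in range(1000)] for i in range(8)]
    with tempfile.TemporaryDirectory() as tmp:
        paths = write_all(tmp, batches)
        print(len(paths), "files written")
```

Whether this is actually faster depends on the disk and the operating system, so it is worth timing against a plain sequential loop before committing to it.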
Solution 2:
Multithreading could speed it up: have your code push its output onto an in-memory queue, and let printer threads take items off that queue and do the actual writing.
From a system design standpoint, also evaluate whether you need to write everything to the file at all. Consider creating several levels of logging so that a release mode can run faster (if that makes sense in your context).
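A queue-backed printer could look roughly like the sketch below. The class name `QueuedPrinter`, the `writer_loop` function and the `None` sentinel are assumptions made for illustration; only `queue.Queue` and `threading.Thread` come from the standard library.

```python
import queue
import threading

_SENTINEL = None  # hypothetical marker that tells the writer thread to stop


def writer_loop(q, path):
    """Take lines off the queue and write them to a single file."""
    with open(path, "w", encoding="utf-8") as f:
        while True:
            item = q.get()
            if item is _SENTINEL:
                break
            f.write(item)


class QueuedPrinter:
    """Accepts text from the main thread and writes it on a background thread."""

    def __init__(self, path):
        self._queue = queue.Queue()
        self._thread = threading.Thread(
            target=writer_loop, args=(self._queue, path), daemon=True
        )
        self._thread.start()

    def print(self, text):
        # Only an in-memory put happens here; the file write happens elsewhere.
        self._queue.put(text + "\n")

    def close(self):
        # Ask the writer to stop, then wait until everything queued is written.
        self._queue.put(_SENTINEL)
        self._thread.join()
```

Call `close()` before the program exits; otherwise lines still sitting in the queue may never reach the file, because the writer is a daemon thread.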
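The logging-levels suggestion can be sketched with the standard `logging` module. The logger name `"app"`, the `release` flag and the `configure_logging` and `process` functions are hypothetical names for this example; the point is only that detailed messages are skipped cheaply when the level is raised.

```python
import logging


def configure_logging(path, release=False):
    """Log everything in debug mode, only warnings and above in release mode."""
    logger = logging.getLogger("app")  # hypothetical logger name
    logger.setLevel(logging.WARNING if release else logging.DEBUG)
    # Drop handlers from any earlier call so messages are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def process(items, logger):
    """Sum the items, logging per-item detail only at DEBUG level."""
    total = 0
    for item in items:
        # Lazy %-style arguments: the message is only formatted if DEBUG is on.
        logger.debug("processing item %s", item)
        total += item
    logger.warning("processed %d items", len(items))
    return total
```

With `release=True` the per-item `debug` calls return almost immediately and nothing is written for them, so the file only receives the summary line.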
Solution 3:
Use the HDF5 file format
The problem is, you have to write a lot of data.
HDF5 is a format that is very efficient in size and can be read by a variety of tools.
Be prepared for a few challenges:
- There are multiple Python packages for HDF5; you will have to find the one that fits your needs.
- Installation is not always simple (but there may be a Windows binary installer).
- Expect a bit of study to understand the data structures to be stored.
- It will occasionally need some CPU cycles: typically you write a lot of data quickly, and at some point it has to be flushed to disk. At that moment it starts compressing the data, which can take a few seconds. See "GIL for IO bounded thread in C extension (HDF5)".
Anyway, I think it is very likely you will manage, and apart from faster writes to the files you will also gain smaller files, which are simpler to handle.