byteme
Read/write bytes from various sources
This library implements a few functors to read buffered inputs from uncompressed or Gzip-compressed files or buffers. Classes can be exchanged at compile- or run-time to easily re-use the same code across different input sources. The aim is to consolidate some common boilerplate across several projects, e.g., tatami, singlepp. Interfacing with Zlib is particularly fiddly and I don't want to be forced to remember how to do it in each project.
To read bytes, create an instance of the desired Reader class and loop until no bytes remain in the source.
To write bytes, create the desired Writer class and supply an array of bytes until completion.
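As a rough illustration of both loops, the sketch below uses hypothetical stand-in classes (`Reader`, `Writer`, `MemoryReader`, `MemoryWriter`) and a hypothetical helper `copy_all()`. None of these are byteme's actual classes. It assumes that `read()` returns the number of bytes copied and returns zero once the source is exhausted; the real signatures may differ, so check the reference documentation.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in interfaces for illustration; not byteme's actual API.
struct Reader {
    virtual ~Reader() = default;
    // Copies up to `n` bytes into `out` and returns the number copied (0 once exhausted).
    virtual std::size_t read(unsigned char* out, std::size_t n) = 0;
};

struct Writer {
    virtual ~Writer() = default;
    virtual void write(const unsigned char* in, std::size_t n) = 0;
    virtual void finish() = 0;
};

// Reads from an in-memory array, standing in for an uncompressed buffer source.
class MemoryReader final : public Reader {
public:
    MemoryReader(const unsigned char* data, std::size_t size) : my_data(data), my_left(size) {}

    std::size_t read(unsigned char* out, std::size_t n) override {
        const std::size_t take = std::min(n, my_left);
        std::copy_n(my_data, take, out);
        my_data += take;
        my_left -= take;
        return take;
    }

private:
    const unsigned char* my_data;
    std::size_t my_left;
};

// Appends everything it receives to a vector.
class MemoryWriter final : public Writer {
public:
    void write(const unsigned char* in, std::size_t n) override {
        output.insert(output.end(), in, in + n);
    }

    void finish() override {}

    std::vector<unsigned char> output;
};

// Loops until no bytes remain in the reader, supplying each chunk to the writer.
inline std::size_t copy_all(Reader& reader, Writer& writer, std::size_t chunk_size) {
    std::vector<unsigned char> chunk(chunk_size);
    std::size_t total = 0;
    while (true) {
        const std::size_t n = reader.read(chunk.data(), chunk.size());
        if (n == 0) {
            break; // source exhausted
        }
        writer.write(chunk.data(), n);
        total += n;
    }
    writer.finish();
    return total;
}
```

The chunk size only controls how many bytes are requested per call; the loop itself is the same for any source or sink.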
More details can be found in the reference documentation.
For the readers:
| Class | Description |
|---|---|
| RawBufferReader | Read from an uncompressed buffer |
| RawFileReader | Read from an uncompressed file |
| ZlibBufferReader | Read from a Zlib-compressed buffer |
| GzipFileReader | Read from a Gzip-compressed file |
| IstreamReader | Read from a std::istream |
For the writers:
| Class | Description |
|---|---|
| RawBufferWriter | Write to an uncompressed buffer |
| RawFileWriter | Write to an uncompressed file |
| ZlibBufferWriter | Write to a Zlib-compressed buffer |
| GzipFileWriter | Write to a Gzip-compressed file |
| OstreamWriter | Write to a std::ostream |
The different subclasses can be switched at compile time via templating, or at run-time by exploiting the class hierarchy:
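As a minimal sketch of both approaches, the following uses hypothetical stand-in classes (`Reader`, `MemoryReader`, `ZeroReader`) and hypothetical helpers (`count_bytes()`, `make_reader()`) rather than byteme's actual headers. It assumes a `read()` method that returns the number of bytes copied; the real signatures may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in base class for illustration; not byteme's actual API.
struct Reader {
    virtual ~Reader() = default;
    // Copies up to `n` bytes into `out` and returns the number copied (0 once exhausted).
    virtual std::size_t read(unsigned char* out, std::size_t n) = 0;
};

// Reads from an in-memory array.
class MemoryReader final : public Reader {
public:
    MemoryReader(const unsigned char* data, std::size_t size) : my_data(data), my_left(size) {}

    std::size_t read(unsigned char* out, std::size_t n) override {
        const std::size_t take = std::min(n, my_left);
        std::copy_n(my_data, take, out);
        my_data += take;
        my_left -= take;
        return take;
    }

private:
    const unsigned char* my_data;
    std::size_t my_left;
};

// Produces a fixed number of zero bytes, standing in for a second kind of source.
class ZeroReader final : public Reader {
public:
    explicit ZeroReader(std::size_t size) : my_left(size) {}

    std::size_t read(unsigned char* out, std::size_t n) override {
        const std::size_t take = std::min(n, my_left);
        std::fill_n(out, take, static_cast<unsigned char>(0));
        my_left -= take;
        return take;
    }

private:
    std::size_t my_left;
};

// Compile-time switching: the concrete reader type is a template parameter,
// so the same function body is reused for every source.
template<class Reader_>
std::size_t count_bytes(Reader_& reader) {
    unsigned char chunk[64];
    std::size_t total = 0;
    while (true) {
        const std::size_t n = reader.read(chunk, sizeof(chunk));
        if (n == 0) {
            break;
        }
        total += n;
    }
    return total;
}

// Run-time switching: the subclass is chosen while the program runs and
// accessed through a pointer to the base class.
inline std::unique_ptr<Reader> make_reader(bool zeros, const std::vector<unsigned char>& data) {
    if (zeros) {
        return std::make_unique<ZeroReader>(data.size());
    }
    return std::make_unique<MemoryReader>(data.data(), data.size());
}
```

Passing a concrete type to `count_bytes()` fixes the choice at compile time, while passing `*make_reader(...)` dispatches through the virtual `read()` at run-time.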
Most of the Reader and Writer constructors will also accept a matching Options instance to fine-tune their behavior.
Some applications need to access small chunks or individual bytes from the input stream. Calling Reader::read() for each request could be too expensive, e.g., if each call makes some attempt to access a storage device. In such cases, users can create a BufferedReader class to wrap each Reader. This will read a large chunk into a buffer from which smaller chunks or individual bytes can be extracted.
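To show why the wrapper helps, here is a minimal sketch of the idea using hypothetical stand-ins: `CountingReader` records how often the underlying source is accessed, and `ByteBuffer` plays the role of the buffering wrapper. The method names `valid()`, `get()` and `advance()` are assumptions made for this example and may not match byteme's BufferedReader interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in base class for illustration; not byteme's actual API.
struct Reader {
    virtual ~Reader() = default;
    // Copies up to `n` bytes into `out` and returns the number copied (0 once exhausted).
    virtual std::size_t read(unsigned char* out, std::size_t n) = 0;
};

// In-memory source that counts how many times read() is called.
class CountingReader final : public Reader {
public:
    CountingReader(const unsigned char* data, std::size_t size) : my_data(data), my_left(size) {}

    std::size_t read(unsigned char* out, std::size_t n) override {
        ++calls;
        const std::size_t take = std::min(n, my_left);
        std::copy_n(my_data, take, out);
        my_data += take;
        my_left -= take;
        return take;
    }

    std::size_t calls = 0;

private:
    const unsigned char* my_data;
    std::size_t my_left;
};

// Hypothetical buffering wrapper: fetches one large chunk at a time and
// serves individual bytes from memory.
class ByteBuffer {
public:
    ByteBuffer(Reader& source, std::size_t capacity) : my_source(source), my_buffer(capacity) {
        refill();
    }

    // True if a byte is available at the current position.
    bool valid() const { return my_position < my_available; }

    // Current byte; only call this when valid() is true.
    unsigned char get() const { return my_buffer[my_position]; }

    // Moves to the next byte, refilling once the chunk is used up.
    void advance() {
        ++my_position;
        if (my_position == my_available) {
            refill();
        }
    }

private:
    void refill() {
        my_available = my_source.read(my_buffer.data(), my_buffer.size());
        my_position = 0;
    }

    Reader& my_source;
    std::vector<unsigned char> my_buffer;
    std::size_t my_available = 0;
    std::size_t my_position = 0;
};

// Example of byte-by-byte parsing on top of the wrapper.
inline std::size_t count_newlines(ByteBuffer& input) {
    std::size_t lines = 0;
    while (input.valid()) {
        if (input.get() == '\n') {
            ++lines;
        }
        input.advance();
    }
    return lines;
}
```

With a sufficiently large capacity, the number of calls to the underlying `read()` depends on the chunk size rather than on how many individual bytes the parser inspects.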
We can also extract a range of bytes:
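A range request has to cope with ranges that span more than one buffered chunk. The sketch below shows one way to do this with hypothetical stand-ins (`Reader`, `MemoryReader`, `ByteBuffer`); the `extract()` name, its argument order and its return value are assumptions for this example, not byteme's confirmed signature.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in base class for illustration; not byteme's actual API.
struct Reader {
    virtual ~Reader() = default;
    // Copies up to `n` bytes into `out` and returns the number copied (0 once exhausted).
    virtual std::size_t read(unsigned char* out, std::size_t n) = 0;
};

// Reads from an in-memory array.
class MemoryReader final : public Reader {
public:
    MemoryReader(const unsigned char* data, std::size_t size) : my_data(data), my_left(size) {}

    std::size_t read(unsigned char* out, std::size_t n) override {
        const std::size_t take = std::min(n, my_left);
        std::copy_n(my_data, take, out);
        my_data += take;
        my_left -= take;
        return take;
    }

private:
    const unsigned char* my_data;
    std::size_t my_left;
};

// Hypothetical buffering wrapper that supports range extraction.
class ByteBuffer {
public:
    ByteBuffer(Reader& source, std::size_t capacity) : my_source(source), my_buffer(capacity) {
        refill();
    }

    // Copies up to `n` bytes into `out`, refilling the internal buffer as needed.
    // Returns the number of bytes copied, which is less than `n` only if the
    // source ran out first.
    std::size_t extract(std::size_t n, unsigned char* out) {
        std::size_t done = 0;
        while (done < n && my_position < my_available) {
            const std::size_t take = std::min(n - done, my_available - my_position);
            std::copy_n(my_buffer.data() + my_position, take, out + done);
            my_position += take;
            done += take;
            if (my_position == my_available) {
                refill();
            }
        }
        return done;
    }

private:
    void refill() {
        my_available = my_source.read(my_buffer.data(), my_buffer.size());
        my_position = 0;
    }

    Reader& my_source;
    std::vector<unsigned char> my_buffer;
    std::size_t my_available = 0;
    std::size_t my_position = 0;
};
```

Returning the number of bytes actually copied lets the caller detect a truncated input without a separate end-of-stream check.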
We can even perform the reading in a separate thread via the ParallelBufferedReader class. This allows the (possibly expensive) disk I/O operations to be performed in parallel with the user-level parsing.
Similarly, BufferedWriter will cache all write requests into a large buffer, intermittently calling Writer::write() to push the buffered bytes to the underlying storage.
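The caching behavior can be sketched as follows, again with hypothetical stand-ins: `CountingWriter` records how often the underlying sink is touched, and `WriteBuffer` plays the role of the buffering wrapper. The explicit `flush()` method is an assumption of this example; byteme's BufferedWriter may expose a different way to push out the remaining bytes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in base class for illustration; not byteme's actual API.
struct Writer {
    virtual ~Writer() = default;
    virtual void write(const unsigned char* in, std::size_t n) = 0;
};

// In-memory sink that counts how many times write() is called.
class CountingWriter final : public Writer {
public:
    void write(const unsigned char* in, std::size_t n) override {
        ++calls;
        output.insert(output.end(), in, in + n);
    }

    std::size_t calls = 0;
    std::vector<unsigned char> output;
};

// Hypothetical buffering wrapper: collects small writes and forwards them
// to the underlying writer in larger chunks.
class WriteBuffer {
public:
    WriteBuffer(Writer& sink, std::size_t capacity) : my_sink(sink), my_capacity(capacity) {
        my_buffer.reserve(capacity);
    }

    // Caches a single byte, pushing the buffer out once it is full.
    void write(unsigned char byte) {
        my_buffer.push_back(byte);
        if (my_buffer.size() >= my_capacity) {
            flush();
        }
    }

    // Caches a range of bytes one at a time.
    void write(const unsigned char* in, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            write(in[i]);
        }
    }

    // Pushes any cached bytes to the underlying writer.
    // Must be called once at the end so that no bytes are left behind.
    void flush() {
        if (!my_buffer.empty()) {
            my_sink.write(my_buffer.data(), my_buffer.size());
            my_buffer.clear();
        }
    }

private:
    Writer& my_sink;
    std::size_t my_capacity;
    std::vector<unsigned char> my_buffer;
};
```

In this sketch, forgetting the final `flush()` would silently drop the last partial chunk, which is why a real implementation would typically tie that step to a finishing call or to destruction.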
FetchContent
If you're using CMake, you just need to add something like this to your CMakeLists.txt:
Then you can link to byteme to make the headers available during compilation:
find_package()
You can install the library by cloning a suitable version of this repository and running the following commands:
Then you can use find_package() as usual:
If you're not using CMake, the simple approach is to just copy the files in the include/ subdirectory - either directly or with Git submodules - and include their path during compilation with, e.g., GCC's -I.
To support Gzip-compressed files, we also need to link to Zlib. When using CMake, byteme will automatically attempt to use find_package() to find the system Zlib. If no Zlib is found, it is skipped and no Gzip functionality is provided by the library. Users can also set the BYTEME_FIND_ZLIB option to OFF to provide their own Zlib.
I thought about using C++ streams, much like how the zstr library handles Gzip (de)compression. However, I'm not very knowledgeable about the std::istream interface, so I decided to go with something simpler. Just in case, I did add a byteme::IstreamReader class so that byteme clients can easily leverage custom streams.