Recently, I was having a somewhat heated discussion with a friend about the Java I/O library (specifically java.io.*). His position was that the library is unnecessarily cluttered and verbose, and that I/O in C is much simpler and more productive. Whilst I agreed with some of that, I also argued that the Java I/O library is more powerful and more flexible than C’s. Here are some of the main points we covered:
- Abstracting stream encodings away from stream processing. The ability for objects in Java to delegate to an InputStream or OutputStream provides a very nice way to decouple the encoding of information from the processing of it. An interesting example in the Whiley compiler is that of name mangling, that is, encoding the Whiley type of a function into its name. To generate the mangling, I have an OutputStream implementation which takes a Whiley type and serialises it into binary data. Then, a second implementation of OutputStream takes an arbitrary stream of binary data and encodes it into (roughly speaking) 7-bit ASCII; in other words, it turns the serialised data into a form that can safely be used as part of a method name according to the JVM Spec. And, of course, I have the mirror of this for reading the type out of a mangling.
- Stateful encoders/decoders are important. One idea we discussed was that encoding could be handled if: (1) Java had support for lambdas; and (2) classes such as FileReader accepted a decoding (lambda) function. This would cover a large number of use cases, and conceptually simplify the I/O framework. However, we would need our decoding functions to have state to be useful (and I don’t believe that JSR 335 supports this). State is typically necessary in situations where we can, or want to, read chunks larger than necessary. For example, when performing some kind of buffering we read a large chunk and cache it for later. Other examples of stateful stream components include more unusual things, such as providing the ability to insert logging into the pipeline.
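To make the first point concrete, here is a minimal sketch of an encoding OutputStream that wraps another stream. It is not the Whiley compiler's actual mangling scheme: the class name HexEncodingOutputStream and the choice of hex as the identifier-safe encoding are assumptions made purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical encoder (an assumption, not Whiley's real scheme): every byte
// written to it is forwarded to the wrapped stream as two lowercase hex
// characters, so the output only ever contains [0-9a-f].
class HexEncodingOutputStream extends FilterOutputStream {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    HexEncodingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        // FilterOutputStream routes the array-based write() overloads through
        // this method, so overriding it alone covers all writes.
        out.write(DIGITS[(b >> 4) & 0xF]);
        out.write(DIGITS[b & 0xF]);
    }

    // Helper for the example: encode a byte array into an ASCII string.
    static String encode(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream encoder = new HexEncodingOutputStream(sink)) {
            encoder.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(encode(new byte[] {0x00, (byte) 0xFF, 0x41}));
    }
}
```

The code that produces the bytes never needs to know that an encoder sits in the pipeline; it simply writes to an OutputStream.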
At this point, things were all making sense to me. Having worked with C for many years (albeit some time ago now), I’m fairly familiar with how I/O is handled in C. In particular, I/O mostly goes through FILE* handles (although you can read/write to file descriptors directly if you like). You have to rely on functions like fopen() and popen() to create FILE* instances because we don’t know the internal layout of FILE and, hence, cannot construct our own instances. So, my first attack on this structure was to say something like:
“How do you create a FILE* instance from a memory buffer?”
In Java, this is relatively easy since we can create e.g. a ByteArrayInputStream and pass that to anything accepting an InputStream. Well, it turns out you can do this in C! I had never heard of it before, but there is an fmemopen() function for exactly this use case.
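For reference, the Java side of this looks roughly like the following sketch. The class name MemoryStreamExample and the countLines() helper are made up for illustration; only ByteArrayInputStream and the standard reader classes come from java.io.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class MemoryStreamExample {
    // Illustrative helper: accepts any InputStream, whether it is backed by
    // a file, a socket, or (as below) a plain byte array in memory.
    static int countLines(InputStream in) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] buffer = "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8);
        // The memory buffer is passed where a file stream would normally go.
        System.out.println(countLines(new ByteArrayInputStream(buffer)));
    }
}
```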
Undeterred, I countered with:
“How do you create a FILE* instance which automagically encodes/decodes into a user-defined format (e.g. as per my name mangling example above)?”
Again, this is relatively easy to do in Java by providing your own implementation of e.g. InputStream. At this point, there was a long pause (in fact, overnight) in the discussion. The next day, my friend comes back and says: “ah, you obviously haven’t heard of the fopencookie() function then!”. Nope, I hadn’t.
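As a rough sketch of what "providing your own implementation" means on the Java side, the class below decodes a user-defined format on the fly. The format (pairs of hex digits) and the name HexDecodingInputStream are assumptions standing in for a real scheme such as the name mangling; only read() has to be implemented, since InputStream builds its other read methods on top of it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical decoder: reads pairs of hex digits from the wrapped stream
// and presents each pair to the caller as one decoded byte.
class HexDecodingInputStream extends InputStream {
    private final InputStream source;

    HexDecodingInputStream(InputStream source) {
        this.source = source;
    }

    @Override
    public int read() throws IOException {
        int hi = source.read();
        if (hi < 0) {
            return -1; // clean end of stream
        }
        int lo = source.read();
        if (lo < 0) {
            throw new IOException("truncated hex pair");
        }
        int h = Character.digit(hi, 16);
        int l = Character.digit(lo, 16);
        if (h < 0 || l < 0) {
            throw new IOException("not a hex digit");
        }
        return (h << 4) | l;
    }

    @Override
    public void close() throws IOException {
        source.close();
    }

    // Helper for the example: decode a hex string back into text.
    static String decodeToString(String hex) {
        byte[] raw = hex.getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new HexDecodingInputStream(new ByteArrayInputStream(raw))) {
            ByteArrayOutputStream decoded = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                decoded.write(b);
            }
            return new String(decoded.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeToString("4a564d"));
    }
}
```

Anything that accepts an InputStream can consume the decoded bytes without knowing the decoder is there.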
This is an excerpt from the manpage on fopencookie():
The fopencookie() function allows the programmer to create a custom implementation for a standard I/O stream. This implementation can store the stream's data at a location of its own choosing; for example, fopencookie() is used to implement fmemopen(3), which provides a stream interface to data that is stored in a buffer in memory.
In order to create a custom stream the programmer must:
* Implement four "hook" functions that are used internally by the standard I/O library when performing I/O on the stream.
* Define a "cookie" data type, a structure that provides bookkeeping information (e.g., where to store data) used by the aforementioned hook functions. The standard I/O package knows nothing about the contents of this cookie (thus it is typed as void * when passed to fopencookie()), but automatically supplies the cookie as the first argument when calling the hook functions.
* Call fopencookie() to open a new stream and associate the cookie and hook functions with that stream.
...
Well, I guess you learn a new thing every day …


Except you’re missing the fact that fmemopen() and fopencookie() are not portable, while the Java code will run everywhere.
Hi Pjmlp,
True, but I guess that’s more about the languages themselves rather than their I/O libraries …
Correction: Java will run everywhere with enough resources for a JVM.
@Shannen
Nowadays that’s just about anywhere.
As for I/O comparison between Java and C, you should take a look at Java NIO/NIO2. Almost sure you can achieve stateful encoders/decoders as you described with NIO.
Yes, Java NIO/NIO2 brings significant performance improvements, which could be a new point of comparison between C I/O and NIO/NIO2.
@Nenad
More than 90% of today’s microprocessors aren’t able to run a JVM, even a simplified one. If you want to find one, just disassemble your keyboard and you’ll find a very simple CPU that has most probably been programmed in C.
You can also brag about NIO/2, but remember it’s mostly an interface to the epoll C API.
@Java is good but…
If I disassemble my keyboard I’ll probably find a microcontroller, not a microprocessor. That’s a whole different area of programming. In that area assembler and C are kings. As for embedded systems, take a look at your phone, tablet, TV. Most likely you’ll find a microprocessor capable of running a JVM of some sort.
The blog post is about differences between the I/O libraries in C and Java, and points out nicely that C has some nice functions too. I can brag about NIO/2 because it’s a fine library. In the end, both the NIO and IO libs have to call the OS to execute code. Do you know any major OS that’s not written in C/C++?
This is comparing an archaic unsafe structured language, with minimal native library support, with a modern very fleshed out OOP language with vast library support, which frankly blows C away for safety, features and productivity.
When I first saw the flawed OOP concepts of C++ at university, after C, it was an epiphany for me after the clumsiness of C, especially when I discovered STL! Java later made C/C++ look complicated because it does not need any of the conditional complication, incestuous and conflicting defines, over complicated statement modifiers, and silly header files that C and C++ have; this was quite horrible when I later had to revisit C/C++! Java soft-linking really makes C/C++ look primitive, especially when combined with Maven for builds, rather than make.
The OOP elements of Java, like interfaces and base classes, and stable data types make coding so much easier for binary and especially Unicode character stream coding. This alone makes this discussion about brittle _static_ hook functions for obscure C functions quite laughable. Static conceits poison the C lib and make it a joke for many 21st century programs.
The java.io.*, java.util.*, and java.text.* packages have encapsulated, replaced and extended the whole idea of separate piped *NIX filter program processes; these are no longer constrained by the backwards static state and static hook nonsense in C. These extendable building blocks allow for much faster and more flexible coding, and can easily support multi-processing without the need for heavyweight processes.
Re: About >90% microprocessors of today aren’t able to run a JVM, even a simplified one:
BS, primitive embedded hardware is irrelevant, and even that often uses C++, rather than C. Many portable devices run Android now, and Android runs a variant of Java, including OpenJava libraries, for most of its applications. I’d bet the Objective-C support for I/O in Apple iOS also makes a mockery of C I/O.
Major OS kernels are written in C++, not C, specifically because C has no sensible multi-processing support. Kernels and device drivers are the rare places where it can make sense to spend the significant extra time writing and debugging in languages like C/C++; however, Microsoft Update Tuesday demonstrates just how unsafe C/C++ can be compared to VM-based OOP languages!
Windows continues to be primarily C/C++ (i.e., C with namespaces) as I recall. Linux/*BSD/OSX are all C. I can’t think of a single ‘Major OS’ kernel that can be said to be entirely or even mostly C++. Please correct me if I’m wrong, but all of these unarguably major kernels seem to be doing just fine in a language that had no ‘sensible multiprocessing’ support. What ‘major kernels’ are you referring to?
If recent Oracle JVM issues are any indication, this comparison is becoming less and less true, isn’t it? I won’t argue the safety of VM based languages in theory, but in practice and given enough time, they seem to lose a little bit of that safety. Granted, I’m referring to the VM implementation and not the language.
The article is great. That random detail about the fopencookie slays me.
Sadly I feel the comments are off in the woods.
Warren
Glad you liked it Warren!