Why does Ice for Java run out of memory when sending large strings?
The memory utilization of the Java Virtual Machine (JVM) is affected by several factors. As the Ice run time prepares to send a protocol message, it constructs a temporary buffer to hold the encoded form of the input parameters. With respect to memory usage, the encoding process is roughly equivalent to making a copy of the parameters; as your parameters grow larger, so does the buffer that Ice needs to encode them.
String parameters are especially problematic in Java, for two reasons. First, the immutable nature of Java's string type forces the Ice run time to allocate more memory and make more copies than would otherwise be necessary. Second, there is a mismatch between Java's native string representation and the Ice encoding of a string: Java strings are sequences of 16-bit UTF-16 code units, whereas Ice encodes strings using an 8-bit format (UTF-8). This discrepancy means the Ice run time must always perform a conversion, which requires additional memory allocation.
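To make the cost of that conversion concrete, here is a minimal sketch in plain Java. It does not use any Ice API; the class and method names are hypothetical, and the standard getBytes call merely stands in for whatever conversion the Ice run time performs internally. The 2-bytes-per-char figure is the nominal UTF-16 size; newer JVMs may store some strings more compactly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Ice: compares the nominal in-memory size
// of a Java string with the size of the separate UTF-8 copy that an encoder
// has to allocate.
class StringEncodingSize {
    // Nominal size of the string's UTF-16 representation (2 bytes per char).
    static long utf16Bytes(String s) {
        return 2L * s.length();
    }

    // Size of a freshly allocated UTF-8 copy of the string. The copy exists
    // alongside the original, so both occupy heap space at the same time.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // five chars, one of them outside ASCII
        System.out.println("UTF-16 bytes: " + utf16Bytes(s));
        System.out.println("UTF-8 bytes:  " + utf8Length(s));
    }
}
```

For the five-character sample string, the nominal UTF-16 size is 10 bytes and the UTF-8 copy is 6 bytes. For a string of several hundred megabytes, the same pair of allocations must fit in the heap together.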
As a result, applications that send very large strings can easily exceed the JVM's default maximum heap size. If increasing the JVM's heap limit is not an option, there are some alternative strategies you should consider.
The technique we usually recommend is breaking a large dataset into chunks rather than sending it all at once, as explained in the FAQ, "How do I transfer a file with Ice?". Although it's typically used in file transfer applications, the chunking technique is equally useful for transmitting a large string.
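The chunking step on the sending side might look like the following sketch. It is plain Java with hypothetical names (StringChunker, chunk, maxChars); how each chunk is then passed to an Ice operation depends on your own Slice definitions and is not shown. The sketch assumes you want to avoid cutting a UTF-16 surrogate pair in half, so that every chunk is independently valid text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Ice: splits a large string into pieces
// of at most maxChars chars so that each piece can be sent separately.
class StringChunker {
    static List<String> chunk(String data, int maxChars) {
        if (maxChars < 2) {
            // At least 2 is needed so a surrogate pair always fits in one chunk.
            throw new IllegalArgumentException("maxChars must be at least 2");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < data.length()) {
            int end = Math.min(start + maxChars, data.length());
            // If the cut would fall between the two halves of a surrogate
            // pair, move it back by one char so the pair stays together.
            if (end < data.length() && Character.isHighSurrogate(data.charAt(end - 1))) {
                end--;
            }
            chunks.add(data.substring(start, end));
            start = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (String piece : chunk("abcdefg", 3)) {
            System.out.println(piece);
        }
    }
}
```

Note that this sketch still builds every chunk up front, which roughly doubles the memory held by the string data. In a real application you would more likely produce and send one chunk at a time, so that only a single small substring is alive while the Ice run time encodes it.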
Another solution is to send the data as a sequence of strings rather than a single string. For example, each element of the sequence could represent a line of the string data. As with the chunking approach, the goal is to reduce the maximum length of the strings that the Ice run time must process. You can even combine the chunking and sequence techniques for a further reduction in memory consumption.
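As a rough illustration of combining the two ideas, the sketch below splits the data into lines and then groups the lines into batches of bounded size. The names (LineBatcher, toLines, batch, maxLines) are hypothetical, and the sketch assumes that a Slice sequence of strings is represented as a String[] in your Java code; adjust the types if your mapping differs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Ice: turns one large string into batches
// of lines, so that each batch can be sent as a sequence of short strings.
class LineBatcher {
    // Split on newline characters; the -1 limit keeps trailing empty lines
    // so that joining the lines with "\n" restores the original string.
    static String[] toLines(String data) {
        return data.split("\n", -1);
    }

    // Group the lines into arrays of at most maxLines elements each.
    static List<String[]> batch(String[] lines, int maxLines) {
        if (maxLines < 1) {
            throw new IllegalArgumentException("maxLines must be at least 1");
        }
        List<String[]> batches = new ArrayList<>();
        for (int i = 0; i < lines.length; i += maxLines) {
            int end = Math.min(i + maxLines, lines.length);
            batches.add(Arrays.copyOfRange(lines, i, end));
        }
        return batches;
    }

    public static void main(String[] args) {
        String[] lines = toLines("first\nsecond\nthird");
        for (String[] group : batch(lines, 2)) {
            System.out.println(Arrays.toString(group));
        }
    }
}
```

Splitting by line bounds the length of each string only if the data actually contains line breaks at reasonable intervals. For data with very long lines, a fixed-size split of each line may still be needed.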