Thursday, January 18, 2024

Generating LARGE JSON Files

The PeopleCode native JsonObject and JsonArray classes allow us to create JSON structures as in-memory representations. But what if you need to generate a really LARGE JSON structure? An in-memory JSON Array may consume more memory than you can reasonably allow. Fortunately, PeopleTools includes the Jakarta JSON library, which allows us to write a JSON structure to a stream during construction.

The following code snippet demonstrates creating 10 million JSON objects in an array without any change in memory consumption. The generated file was 2.5 GB in size, but my memory utilization didn't change the entire time the program ran.

Local JavaObject &Json = GetJavaClass("jakarta.json.Json");
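REM ** class name assumed here: any java.io.Writer that accepts a file path would work;
Local JavaObject &writer = CreateJavaObject("java.io.FileWriter", "C:\temp\users-big.json");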

Local JavaObject &gen = &Json.createGenerator(&writer);

Local number &iteration = 1;

REM ** 10 million iterations;
Local number &maxIterations = 10000000;

&gen.writeStartArray();

For &iteration = 1 To &maxIterations
   REM ** start person/user object;
   &gen.writeStartObject();
   &gen.write("id", "" | &iteration);
   &gen.write("firstName", "John");
   &gen.write("lastName", "Smith");
   REM ** start child address object;
   &gen.writeStartObject("address");
   &gen.write("streetAddress", "21 2nd Street");
   &gen.write("city", "New York");
   &gen.write("state", "NY");
   &gen.write("postalCode", "10021");
   &gen.writeEnd();
   REM ** start phone number array;
   &gen.writeStartArray("phoneNumber");
   REM ** start home phone object;
   &gen.writeStartObject();
   &gen.write("type", "home");
   &gen.write("number", "212 555-1234");
   &gen.writeEnd();
   REM ** start fax number object;
   &gen.writeStartObject();
   &gen.write("type", "fax");
   &gen.write("number", "646 555-4567");
   &gen.writeEnd();
   REM ** end array of phone numbers;
   &gen.writeEnd();
   REM ** end person/user object;
   &gen.writeEnd();
End-For;

REM ** end array;
&gen.writeEnd();

REM ** cleanup to flush buffers;
&gen.close();
&writer.close();

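The hard-coded values come directly from the Jakarta generator API documentation. In real life, you would replace these values with database data. I converted numbers to strings to keep the example simple and avoid Java reflection, which PeopleCode would otherwise need in order to choose among the generator's overloaded numeric write methods.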
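
A minimal sketch of that substitution might look like the following. It assumes the rows come from the PeopleTools operator table PSOPRDEFN (fields OPRID and OPRDEFNDESC) and that the output goes through a java.io.FileWriter to a hypothetical path. The table, field choice, file name, and JSON property names are illustrative assumptions, not part of the original example.

```peoplecode
REM ** Sketch only: PSOPRDEFN, its fields, and the output path are assumptions;
Local JavaObject &Json = GetJavaClass("jakarta.json.Json");
Local JavaObject &writer = CreateJavaObject("java.io.FileWriter", "C:\temp\operators.json");
Local JavaObject &gen = &Json.createGenerator(&writer);

Local SQL &cursor = CreateSQL("SELECT OPRID, OPRDEFNDESC FROM PSOPRDEFN");
Local string &oprid, &descr;

REM ** start array of operators;
&gen.writeStartArray();

REM ** each fetched row is written to the stream right away, so rows never pile up in memory;
While &cursor.Fetch(&oprid, &descr)
   &gen.writeStartObject();
   &gen.write("id", &oprid);
   &gen.write("description", &descr);
   &gen.writeEnd();
End-While;

REM ** end array;
&gen.writeEnd();

REM ** closing the generator flushes buffered output to the file;
&gen.close();
&writer.close();
```

Both selected fields are character fields, so they can be passed to the generator as strings, just like the converted numbers in the example above.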

Are you interested in parsing rather than generating large JSON files? Check out our post on JSON Stream Parsing.



Gary F said...

Great job. If they would just simplify the reflection thing you'd have easy access to a near infinite library of functionality ;-)

e.g., you just say
javaObject.callmethodwithreflection('method name', 'parameter names', parameter values) or something

and the new javaobject they create just does the reflection that you or I would do by hand.

This seems like a killer app, right? Instantly you can use the full Excel writer libraries, etc.

Jim Marion said...

Thanks @Gary! We should give the Java reflection angle some thought and see if we can develop some sort of wrapper that makes it less painful.

Robert Chasteen said...

I ran into a similar issue with the PeopleCode JsonBuilder class. My file wasn't large enough to cause issues at build time, but calling the ToString method takes about 45-50 seconds on a 5 MB file. I'm looking for ways to speed this up, because I need to return the serialized JSON in my REST API call.

Jim Marion said...

@Robert, that is a really good point. The Java alternatives are usually slower than the native JSON PeopleCode objects, but perhaps the toString methods are faster? All said, the performance differences may wash out. What about using the Documents module with Document-based messages? Perhaps it can render the JSON faster?