How to Draw Beautiful Charts like Matplotlib in Java or Scala?
Sho Nakamura
Posted on May 11, 2021
I am sure many of you have found that when you try to do machine learning in Java or Scala, there is no cool graphing tool comparable to Python's Matplotlib. Have you ever wondered whether you could draw Matplotlib charts from Java?
Matplotlib4j is a library which gives you the power!
The examples from here on are in Java. Of course, the library can also be used from other JVM languages such as Scala and Kotlin; those examples come later.
First, add Matplotlib4j to the Java project where you want to use Matplotlib.
The usage is similar to Matplotlib's API, so the code can be written intuitively. First, create a Plot object, then call a pyplot method on it to add whatever graph you need, and finally call the show() method. Each pyplot method follows the Builder pattern, so options are chained after it with the help of IDE completion.
With the above Java code, we can draw the following graph.
Some NumPy functions, such as linspace and meshgrid, are provided by the NumpyUtils class to help with graph drawing. The first block generates the x and y data for plotting; here we add random noise to a sine curve. After that, we create a Plot object, add the generated x and y data through the plot() method, and call show() at the end to draw the graph.
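To make the data-generation step concrete without depending on the library, here is a minimal plain-Java sketch. The class NoisySineData and its linspace helper are hypothetical stand-ins written for this illustration (they are not Matplotlib4j's NumpyUtils), and the range of -3 to 3 with 100 points is an assumed example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

// Plain-Java stand-in for the data-generation step; not Matplotlib4j code.
class NoisySineData {

    // Hypothetical helper mirroring numpy.linspace: `num` evenly spaced
    // values from start to stop, both endpoints included.
    // Simplification: unlike NumPy, this sketch requires num >= 2.
    static List<Double> linspace(double start, double stop, int num) {
        if (num < 2) {
            throw new IllegalArgumentException("num must be at least 2");
        }
        double step = (stop - start) / (num - 1);
        List<Double> values = new ArrayList<>(num);
        for (int i = 0; i < num; i++) {
            values.add(start + step * i);
        }
        return values;
    }

    // y = sin(x) plus uniform noise in [0, 1), one value per x.
    static List<Double> noisySine(List<Double> xs, Random rand) {
        return xs.stream()
                 .map(x -> Math.sin(x) + rand.nextDouble())
                 .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Double> x = linspace(-3, 3, 100);
        List<Double> y = noisySine(x, new Random());
        System.out.println(x.size() + " x values and " + y.size() + " y values generated");
    }
}
```

The two resulting lists of Double are the kind of x and y data that would then be handed to the plot.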
This Java code is almost equivalent to the Python implementation below (almost, because the NumPy data generation part is not strictly identical). The method calls are similar, which makes it easy to use if you are a Pythonista.
Matplotlib4j also supports saving to a file. Saving images to a file is convenient for use cases that have no GUI, such as batch processing of machine learning jobs on a server.
As in the original Matplotlib, using the .savefig() method instead of .show() saves the image to a file without popping up a plot window. The only difference is that plt.executeSilently() must be called after .savefig(). This terminating call is needed because savefig() can itself be followed by chained options, so the library cannot know on its own when the chain is complete.
    Random rand = new Random();
    List<Double> x = IntStream.range(0, 1000)
        .mapToObj(i -> rand.nextGaussian())
        .collect(Collectors.toList());

    Plot plt = Plot.create();
    plt.hist().add(x).orientation(HistBuilder.Orientation.horizontal);
    plt.ylim(-5, 5);
    plt.title("histogram");
    plt.savefig("/tmp/histogram.png").dpi(200);

    // Necessary to output the file
    plt.executeSilently();
This will output an image like the one below.
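To see why a terminating call such as executeSilently() is needed, consider the following minimal sketch of a deferred builder. DeferredPlotSketch and SavefigCommand are hypothetical classes invented for this illustration and are not Matplotlib4j's real implementation. The point is only that an option chained after savefig() (here dpi) can be included if the Python line is rendered late, in a separate final step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-builder for illustration; not Matplotlib4j's real classes.
class DeferredPlotSketch {

    // One pending savefig call; options can still be chained onto it.
    static class SavefigCommand {
        private final String path;
        private Integer dpi; // null means "leave Matplotlib's default"

        SavefigCommand(String path) {
            this.path = path;
        }

        SavefigCommand dpi(int value) {
            this.dpi = value;
            return this;
        }

        // Rendered only on request, so options chained later are included.
        // Simplification: the path is not escaped for quotes or backslashes.
        String toPython() {
            String args = "\"" + path + "\"";
            if (dpi != null) {
                args += ", dpi=" + dpi;
            }
            return "plt.savefig(" + args + ")";
        }
    }

    private final List<SavefigCommand> pending = new ArrayList<>();

    // Registers the command but renders nothing yet.
    SavefigCommand savefig(String path) {
        SavefigCommand cmd = new SavefigCommand(path);
        pending.add(cmd);
        return cmd;
    }

    // Terminal step: plays the role of a final "execute" call in this
    // sketch, but only returns the Python lines instead of running them.
    List<String> render() {
        List<String> lines = new ArrayList<>();
        for (SavefigCommand cmd : pending) {
            lines.add(cmd.toPython());
        }
        return lines;
    }

    public static void main(String[] args) {
        DeferredPlotSketch plt = new DeferredPlotSketch();
        plt.savefig("/tmp/histogram.png").dpi(200);
        plt.render().forEach(System.out::println);
    }
}
```

If savefig() rendered its line immediately, the later dpi(200) call would arrive too late to be included, which is the design reason for an explicit final step.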
Switch Python with pyenv, pyenv-virtualenv
To use Matplotlib4j, you need a Python environment with Matplotlib installed. By default, Matplotlib4j uses the Python found on your PATH, but in many cases Matplotlib is not installed in the system default Python.
In that case, you can switch to a Python environment with Matplotlib installed, such as Anaconda, using pyenv or pyenv-virtualenv.
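As a rough illustration of what "switching" means here, a pyenv environment boils down to a specific interpreter binary on disk. The sketch below resolves that path by hand, assuming pyenv's default layout (environments under the versions directory of the pyenv root, which is PYENV_ROOT if set and ~/.pyenv otherwise). The class PyenvInterpreterSketch and the environment name anaconda3-4.4.0 are hypothetical examples, and this is not how Matplotlib4j itself is implemented.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustration only: locating a pyenv-managed interpreter by hand.
class PyenvInterpreterSketch {

    // pyenv's root directory: PYENV_ROOT if set, otherwise ~/.pyenv.
    static Path pyenvRoot() {
        String root = System.getenv("PYENV_ROOT");
        if (root != null && !root.isEmpty()) {
            return Paths.get(root);
        }
        return Paths.get(System.getProperty("user.home"), ".pyenv");
    }

    // Assumed default layout: <root>/versions/<envName>/bin/python.
    static Path pythonBinary(Path root, String envName) {
        return root.resolve("versions")
                   .resolve(envName)
                   .resolve("bin")
                   .resolve("python");
    }

    public static void main(String[] args) {
        // "anaconda3-4.4.0" is a made-up example environment name.
        String envName = args.length > 0 ? args[0] : "anaconda3-4.4.0";
        Path python = pythonBinary(pyenvRoot(), envName);
        System.out.println(python + " executable: " + Files.isExecutable(python));
    }
}
```

Running it with the name of one of your own environments shows whether an interpreter exists at the expected location.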
To use the Python from a pyenv environment, specify a PythonConfig when creating the Plot object as follows.
When used from Scala, the aforementioned scatter plot example can be written as follows; you only need to pay attention to the boxing/unboxing of numbers and the differences between the List classes.
On the Tutorial page, you can find more step-by-step examples in Java, Scala, and Kotlin.
Extra
How it All Started
I recently started reading a book on deep learning and decided to implement it in Scala, which I have been using a lot lately, because copying the book's Python code as-is did not seem interesting. I was happy to be able to write it in a functional style in Scala, but when I got to backpropagation with the steepest descent method, the loss was not dropping at all, and I thought, "What's wrong?"
Of course, the usual way to tackle this is to add more tests, but first I wanted a quick look at what was going on by displaying a graph like the ones in the book. I found, however, that there were no good graphing tools for Scala, and implementing one in Scala from scratch would be too hard. That is why I decided to create this library on top of Matplotlib, a familiar Python library.
Design
Matplotlib4j calls Matplotlib by generating Python code, without using JNI or Jython. Initially, I wanted to implement it with Jython, but Jython only supported Python up to 2.7 and did not support numpy, so Matplotlib, which depends on numpy, would not work either. I decided to abandon this path.
There is also a library that allows you to use CPython from Java code, and it was a candidate because it works with both Python 3 and numpy. However, it requires installing a separate environment-dependent library to use JNI, plus a package from pip on the Python side, which is too much work for something as simple as drawing graphs. So in the end I decided to implement Matplotlib4j without depending on any of these libraries.
Of course, since the Python code is executed via a file, I had to use some tricks for passing variables and handling return values. Fortunately, since the purpose is only to draw graphs, one-way output to a file covers the basic functions, and I think the performance is within the acceptable range despite some latency.
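The general idea of generating Python code and running it from a file can be sketched in plain Java as below. This is an illustration under stated assumptions, not Matplotlib4j's actual implementation: the class PythonScriptSketch, its method names, and the output path are all hypothetical. Java lists are passed one way by writing them into the script as Python list literals.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Illustration only: generate a Matplotlib script and run it via a file.
class PythonScriptSketch {

    // List<Double>.toString() yields e.g. "[0.0, 1.0]", which is also a
    // valid Python list literal for finite values (NaN/Infinity are not).
    // Simplification: outputPath is not escaped for quotes or backslashes.
    static String buildScript(List<Double> x, List<Double> y, String outputPath) {
        return String.join("\n",
            "import matplotlib",
            "matplotlib.use('Agg')  # render without a GUI",
            "import matplotlib.pyplot as plt",
            "plt.plot(" + x + ", " + y + ")",
            "plt.savefig('" + outputPath + "')",
            "");
    }

    // Write the script to a temporary .py file.
    static Path writeScript(String script) throws IOException {
        Path file = Files.createTempFile("plot_sketch", ".py");
        Files.write(file, script.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    // Launch the given Python binary on the script and wait for it to end.
    static int run(String pythonBin, Path script) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(pythonBin, script.toString())
            .inheritIO()
            .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        String script = buildScript(List.of(0.0, 1.0, 2.0), List.of(1.0, 4.0, 9.0), "sketch.png");
        Path file = writeScript(script);
        System.out.println("Script written to " + file);
        // Only runs Python when a binary is passed, e.g. "python3".
        if (args.length > 0) {
            System.out.println("Exit code: " + run(args[0], file));
        }
    }
}
```

Because data only flows from Java into the script, nothing has to be read back from Python, which matches the one-way output described above. The cost is the latency of starting a Python process for each plot.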
A tutorial is now being prepared to walk through the features.
If you only want a quick idea of what Matplotlib4j does, skip the tutorial and go to the next section: How to use
How to use
Here is an example. Find more examples in MainTest.java